Hacker News | hobbyist's comments

There are no cmp or branch instructions. How is this Turing complete?


This is a partial ALU and register file. It is not a CPU.


> There are no cmp or branch instructions. How is this Turing complete?

I see no claim that it is?


A CPU, by common sense, should be Turing complete.


All you need is logical AND, logical OR, and logical NOT. I suppose they could be composed from the operations there.
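To make the point concrete, here is a small sketch in Python (purely illustrative, not tied to the hardware in question) showing that the remaining two-input boolean operations are just compositions of AND, OR, and NOT:

```python
# Composing further boolean operations from AND, OR, and NOT alone.
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def XOR(a, b):
    # (a OR b) AND NOT (a AND b)
    return AND(OR(a, b), NOT(AND(a, b)))

def NAND(a, b):
    return NOT(AND(a, b))

# Truth tables over all input combinations:
print([XOR(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 0]
print([NAND(a, b) for a in (0, 1) for b in (0, 1)])  # [1, 1, 1, 0]
```

(NAND alone is already functionally complete, so having all three of AND/OR/NOT is more than enough.)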


I think Makefiles have such pattern matching, which can be used for list transformations like this too.


How different is linear regression from PCA? I understand the procedures and methods are completely different, but wouldn't linear regression also give the same solution on these data sets?


PCA minimizes the error perpendicular to the PC axis. Here's a nice article that goes into more detail: http://www.r-bloggers.com/principal-component-analysis-pca-v...


PCA results in a space, while linear regression results in a function within the same space as the original data.


To begin with, they solve different problems:

Linear regression has a notion of input and output: you want to model how the output depends on the input.

PCA does not. You have a point cloud and want to find a compressed representation of it.
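To make the difference concrete, here is a small pure-Python sketch (with made-up toy data) comparing the regression slope of y on x, which minimizes vertical squared error, against the slope of the first principal component, which minimizes perpendicular squared error. The two are generally close on well-correlated data, but not identical:

```python
import math

# Toy 2D point cloud (hypothetical data, just for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.3, 1.9, 3.2, 3.8]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs) / n
syy = sum((y - my) ** 2 for y in ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

# Regression of y on x: minimizes *vertical* squared error.
reg_slope = sxy / sxx

# First principal component: the direction minimizing *perpendicular*
# squared error (closed form for a 2x2 covariance matrix:
# tan(2*theta) = 2*sxy / (sxx - syy)).
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
pca_slope = math.tan(theta)

print(reg_slope, pca_slope)  # close here, but not identical
```

The PCA slope always lies between the slope of the regression of y on x and the (inverted) slope of the regression of x on y, which is why the two can look similar on tightly correlated data while answering different questions.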


I am guessing you went for a graduate degree in math. How hard was it to get admission? Would you recommend some good schools for doing something like what you did? I am just a little younger, but hungry for math knowledge.


The author has won an IOCCC contest. This would have just pricked him a bit.


Could you elaborate more or give some references on the second and third parts? I am done with the first. I seriously need some profound knowledge of the second and third, which a lot of people like you talk about. I need to put together a plan to get there too. Scheme to C looks fun though :-) . Where should I start?


If you want to implement Scheme in C, I'd take a look at Scheme From Scratch (http://michaux.ca/articles/scheme-from-scratch-bootstrap-v0_...)

It's pretty excellent for learning how to implement a simple language in C.


Do yourself a favor and write the Scheme compiler in Scheme itself, generating assembly code directly.

http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf

Also a nice way to learn about bootstrapping compilers.

Generating code for another OS without Scheme support is just a matter of having your compiler cross-compile to the other OS. Then use the freshly baked compiler to recompile your compiler on the new OS.


I often read that Spark avoids the costly synchronization required in MapReduce, since it uses DAGs. Can someone explain how that is achieved? If the application demands that jobs be launched together, that can be done even with Hadoop/MapReduce. If one job requires the output of another, then the job has to wait for synchronization whether it's MapReduce or a DAG.


Spark's major benefit comes from storing intermediate results in memory instead of in HDFS as Hadoop does. Let's say a certain query needs to run 3 MapReduce jobs A, B, C one after another. In Hadoop, there will be 3 HDFS reads and writes. With Spark, there will be only 1 HDFS read (before launching A) and 1 write (after C is completed). In Spark, the output of A gets stored in RAM, which is read by B, and so on until the final write.

The DAG used by Spark represents how one job/partition of data depends on another job/partition and what methods (e.g. filter) need to be applied to the parent data to get the child data. This is useful when a node goes down and that portion of data has to be recomputed. Note that users can choose to persist some intermediate results to HDFS to avoid recomputation in case of failure.
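The lazy-lineage idea can be sketched in a few lines of Python. This is NOT the real RDD API, just a toy illustration: transformations only record how to compute a dataset from its parent; nothing runs until an action forces evaluation, and computed results are memoized in memory so downstream steps reuse them instead of re-reading from disk:

```python
# Toy sketch of Spark-style lazy evaluation and lineage (hypothetical,
# not the real RDD interface).
class ToyRDD:
    def __init__(self, compute, parent=None):
        self._compute = compute  # how to build this dataset
        self._parent = parent    # lineage link; would drive recomputation on failure
        self._cached = None

    def map(self, f):
        # Record the dependency; nothing is computed yet.
        return ToyRDD(lambda: [f(x) for x in self.collect()], parent=self)

    def filter(self, pred):
        return ToyRDD(lambda: [x for x in self.collect() if pred(x)], parent=self)

    def collect(self):
        # An "action": forces evaluation, memoizing the result in memory.
        if self._cached is None:
            self._cached = self._compute()
        return self._cached

base = ToyRDD(lambda: list(range(10)))
result = base.map(lambda x: x * 2).filter(lambda x: x % 3 == 0)
print(result.collect())  # [0, 6, 12, 18]
```

Because each node remembers its parent and its transformation, a lost partition can in principle be rebuilt by replaying the lineage from the nearest surviving (or persisted) ancestor.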


I believe it is because Spark stores data in objects called RDDs (Resilient Distributed Datasets), and these are lazily evaluated. I could be wrong though.

http://spark.incubator.apache.org/docs/latest/quick-start.ht...


Good question. I did read the Spark paper, and one reason I found for Spark doing so much better than Hadoop was that it avoids the unnecessary serialization and deserialization which Hadoop just cannot avoid. The RDDs, as mentioned by @rxin, are in-memory objects and thus do not require frequent serialization/deserialization when multiple operations are being applied to the data.


I am not well versed in browser design; could you highlight what hard problems you are referring to?


Check out the high-level overview of Servo's design:

https://github.com/mozilla/servo/wiki/Design

The goal is extremely pervasive concurrency in aspects that no modern engine has yet begun to approach (and likely could not approach without enormous effort and/or a full-on rewrite).


You can drop the "likely". Retrofitting pervasive concurrency onto a sizable existing codebase is, to a first approximation, impossible. It makes merely retrofitting pervasive unit testing (hard, but "merely" a long mechanical slog) or string-encoding correctness look like a cakewalk. "Impossible" here can be taken to read as "would require as much effort to do the retrofit as the rewrite would take".


> Copy-on-write DOM. In Servo, the DOM is a versioned data structure that is shared between the content (JavaScript) task and the layout task, allowing layout to run even while scripts are reading and writing DOM nodes.

Wow


>> Because C++ is poorly suited to preventing these problems,

It is really hard to take seriously any project that has this kind of nonsense in its introduction.


I think you're being downvoted because if anyone knows about the pain of using C++ to develop a browser engine, it's Mozilla. They have some pretty strong empirical evidence to back up their statement.


No, you don't bash a language for the failures of the people using it. It is hyperbole unsupported by any evidence, and it is precisely this mentality that keeps our profession on the Greatest New Thing(TM) every x years treadmill, for better or worse.

Language bashing/trolling serves no purpose.

I am rather most likely being downvoted because HN in the past few months has taken an extreme downturn towards a slashdot/herd mentality, but that is just another pendulum swinging.


> I am rather most likely being downvoted because HN in the past few months has taken an extreme downturn towards a slashdot/herd mentality

You're being downvoted because you clearly have no idea what you're talking about. It's entirely appropriate to criticize a language for being too hard to use correctly - your argument would apply equally well to saying that C++ should never have been created because it was simply the fault of people using earlier languages not using them sufficiently well.

This approaches dark comedy because you're also criticizing a company which maintains one of the largest and most important codebases in existence and has a huge list of bugs and security issues demonstrating that even in the hands of very experienced developers it's too easy to use C++ incorrectly.


Oh, so you think the statement that was quoted was that C++ was too hard to use? Because when you read what was actually quoted, it says C++ is poorly suited to solve issues related to data races and parallelism, which is a pretty false statement.

Are you able to see how that is different now, with a little help? Or is it not enough to make up straw arguments and put words into my mouth, so now you want to do the same for the Mozilla Foundation?


C++ is the only realistic option for some things, but it's not without serious problems. Intelligent criticism of a formal system, such as a programming language, is what allows us to make better ones in the future.

If you honestly think there's no point in trying to make better programming languages, you're welcome to continue programming in FORTRAN IV. But don't expect other people to take your opinion seriously about what "serves no purpose".


Can you point to where I said any such thing?

The point was simple: language bashing/trolling is childish. It is hard to take seriously a project with that kind of statement in its introduction.

If instead it said Rust does X in order to achieve goal Y, that would be more useful.


"Language bashing" is merely a derogatory term for "language criticism". Language criticism is, as I said, a necessary foundation for designing better languages; if existing languages are flawless, then designing new ones would simply be part of "the Greatest New Thing(TM) every x years treadmill" that you referred to in your initial comment. On the other hand, if you can identify real problems in existing languages, then you have some hope of designing new languages that are actually better, not just newer. But you can't do that without "language bashing". Without "language bashing", we'd still be back at FORTRAN IV. And if that's what you want, you can probably live in FORTRAN-world.

"Trolling" is bullshitting to provoke a response; your accusation there, if we take you at your word, would have to be that the Rust developers are developing a new programming language as a sort of hoax in order to get a rise out of C++ programmers.

I don't think it really serves your point well to suggest that Graydon Hoare, Brian Anderson, Sebastian Sylvan, Samsung, and so on, are "childish" and "hard to take seriously" and dishonest, because that requires us to choose between taking Graydon and Samsung et al. seriously and taking you, Shawn Butler, seriously. This is a competition that will be hard for you to win. Perhaps instead you could find a way to couch your criticism (whatever it is) in a way that makes it easier to accept. As it is, only people who have a pre-existing hate for Graydon or Samsung will be inclined to accept your argument.


You have failed to understand a one-line sentence and inflated it into some agenda entirely of your own creation and are attributing it falsely to me.

The contention was simple and utterly straightforward. Having a conjectural, value-laden statement that is false and using it in an introduction gives me pause. Doing so persuades people with surface knowledge (as evidenced by this comment thread) but gives people with experience a different response.

Now you are making an appeal to authority.


I really don't think I'm the one who's failing here. If you'll excuse me, I have to get back to programming embedded systems in C++.


No, you're being downvoted because fine-grained parallel programming in C++ is really hard, but you're saying that it's not.


Wow, I said what?

Also while I hate to be the harbinger of bad news, parallel programming in any language is Really Hard(tm) for most people. Personally, I think there's a fundamental disconnect in how human cognition perceives the world on the one hand and massive parallelism on the other at which very few people I have met really excel relative to the larger population.


Are you claiming that all languages are equally good for all purposes? Or are you claiming that C++ is in fact particularly well suited for parallel development and inherent avoidance of data races?

Coming from a C++ developer, I don't think you're well aware of its limitations. Do yourself a favor and learn a little bit of Rust. It'll open up your mind a little, and just might make you a better C++ programmer.


The C++ memory model (which, btw, is what underlies any threading facility) was always sufficient for handling the complexity, but it never offered any opinions or constraints on implementations until C++11; the standards committee wisely preferred to defer to the writers of libraries to provide appropriate designs and services tailored to the specific needs of user communities and particular operating system facilities.

I'm pretty sure based on your writing that you have little to offer me that I don't already understand about the language, but thanks for your hollow advice though. Here, I'll make an empty prediction in return: in 3 years you will be complaining about how difficult it is to do distributed parallelism in Rust and what absolute garbage it is compared to <insert name here>.

The point was simple but fanbois want straw men and windmills at which to tilt: if you have something you think is better, extol its virtues and provide comparative analysis instead of bashing/trolling what currently exists completely out of any useful context.

I really have little interest in discussing anything technical on HN anymore. Here's a token Wikipedia link; that's what passes for knowledge, I guess [0].

Also I would keep your "advice" to yourself. I certainly hope you wouldn't speak to people like that in person, and you definitely wouldn't be allowed to speak in such a fashion to me in particular. I am pretty familiar with Rust's evolving semantics and syntax, thanks.

[0] http://en.wikipedia.org/wiki/C%2B%2B11#Threading_facilities


I'm not sure exactly what you mean by "the C++ memory model", but I'm pretty sure that it does not underlie Rust's model of threading, unless you just mean that Rust assumes a von Neumann architecture. First of all, the stacks of Rust threads are segmented, unlike those of C++ threads, so there's one feature that cannot be duplicated by a C++ library.

But more importantly, Rust doesn't allow any shared mutable state, meaning that all the synchronization primitives C++11 provides can be applied by the language automatically on your behalf. Of course, nothing prevents someone from creating a library in C++ that does the same thing, but there's no way for the compiler to enforce the clean thread-wise separation of state, and if you've forgotten to get rid of some pointer to an object you're passing to another thread, you potentially won't know about it until you're debugging it in production. Also, there are a number of threading optimizations, involving reordering or eliding memory accesses, that the Rust compiler can make but the C++ compiler can't.


The bait statement was:

"Or are you claiming that C++ is in fact particularly well suited for parallel development and inherent avoidance of data races?"

I answered that the C++ memory model was certainly sufficient and the implementation was left to libraries until C++11, when it was standardized. I did not mean to imply that the C++ memory model underlies that of Rust.

And I agree these are great points to illustrate. They would make some great text to use on the project's introduction page in place of the trolling. Still, I believe the same deficiency exists regarding compiler enforcement for Rust, since the "unsafe" keyword allows for manual management, correct? I can't intelligently comment on the compiler optimizations to which you refer; could you provide some reference to further analysis? Regardless, the same two people are downvoting my comments, so I won't be commenting further.

Although you clearly understand, I'll throw in the token links for people who may not understand what a memory model is or how the C++ memory model has been formally standardized. To be honest, I don't know why I bother.

[0]: http://en.wikipedia.org/wiki/Memory_model_(programming)

[1]: http://www.cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf


You come off as whiny and immature, which is why your comments keep getting voted down. I have no doubt you're under the impression they are getting voted down because most people disagree with you, but you're wrong.

Out of curiosity, who wouldn't allow me to speak to you in such a fashion? Who's this imaginary authority who controls how I speak to children?


This says more about you than the quality of the project.


Are you kidding?

HTML is a massive spec even before you step in and implement the HTML5 JavaScript APIs.

Then you also have to make sure that your rendering engine behaves in a sane way and supports all the obscure CSS rules from the spec and from WebKit...

There's a reason Opera stopped playing this game; it's hard work.


Why is it so massive? Why does it need to specify in grave detail the difference between <span>, <label>, <code>, etc.? How about a spec like this:

- There are two types of elements: block and inline. You can declare an element name with a particular type using, perhaps, XML namespaces, or just a JSON object.

- Inline elements can be floated left/right alongside other inline elements.

- Block elements may be positioned absolutely inside their parent elements or relatively to their current position. You cannot float them.

- Width/height may be specified as a pixel width, percentage width of parent, percentage of available space, or percentage of screen size. Additionally, you may specify the box model: whether to include borders, margins, padding, etc.

- CSS rules about typography, margins, borders, padding, etc. shall apply. This way, you can include your own basic rules and build on top of them.

I had the misfortune to do a bit of hacking with GTK+ and at first thought, "what an archaic way to lay out elements!" Then it came to me that HTML + CSS is not advanced; it is cluttered. There are many ways to position an element on the page, and they will conflict. Additionally, things like opacity affecting z-index, a parent element needing a size to give the child element a percentage size, etc. lead to a ton of hacks. It's time we had a better, cleaner tool than the browser if we are going to build serious apps on this platform.


Because if the spec didn't, every engine would do things its own way and no website would work in more than one browser.


We do need a spec, I'm not arguing against it. I am just tired of the fact that I can position elements via many conflicting methods:

- position: absolute/relative/fixed + top/left values

- float: left/right

- positive/negative margins

- float values of other elements

Yet I cannot do simple things like telling a block element to take up all available height.

The spec focuses on various types of data that could be represented. For example, we have a <code> tag. This is done in an attempt to be semantic. However, it fails at being comprehensive and ends up falling back on things like <code class="python"> instead of <python>. The distinction between <code>, <var>, <span>, <label>, and other inline elements is completely arbitrary, and which elements get to be first-class citizens is also arbitrary. Giving up and saying that there are only <inline> and <block> elements would simplify things a whole lot. If you can then "subclass" a <block> to create a <p> element or subclass an <inline> element to make a <label>, go for it!


You have basically described XSL FO. In retrospect that's obviously what we should have used for web page layout instead of HTML, but now it's too late. http://www.w3.org/TR/xsl/


From the spec:

> Unlike the case of HTML, element names in XML have no intrinsic presentation semantics. Absent a stylesheet, a processor could not possibly know how to render the content of an XML document other than as an undifferentiated string of characters. XSL provides a comprehensive model and a vocabulary for writing such stylesheets using XML syntax.

So the big issue with XSL is that it's verbose as hell. I remember using XSL Transforms to do some really simple things, and getting it right was horrible. Debugging it was worse. Given a piece of code that uses HTML + CSS vs XSL, I'd pick HTML + CSS any day simply because it's more readable.

However, yes the core of it seems much better thought out than CSS.

> but now it's too late.

Is it? Is it possible to have some XSL FO to HTML5 + CSS compiler?


Not "a cleaner tool than the browser", a cleaner layout spec. Plus, you would have to build a CSS compatibility layer on top. Hard but worthwhile. The sort of thing Adobe might work on.


> Not "a cleaner tool than the browser" a cleaner layout spec.

Yes, agreed. Though the next thing that we might want to tackle is the whole concept of a web page. Seems like storing application state in a URL is a terrible thing, yet it is so convenient for some use cases. This might mess with the idea of a "browser" more since you wouldn't be "browsing" applications, you'd be running them.

> Plus you will have to build a CSS compatibility layer on top. Hard but worthwhile.

Yes, definitely. I can't imagine anything like this taking off without a compatibility layer. However, I think, the compatibility layer could be just HTML, this new CSS replacement, some JavaScript, and a server-side compiler.

> Sort of thing Adobe might work on.

Are you being sarcastic?


Actually, I wasn't entirely: they have a lot of experience in print rendering (PostScript, InDesign) and seem to have an interest in HTML now, but are not really attached to CSS per se. I suspect, however, that they are not...


Backwards compatibility, I think?


Not only do you have to implement the massive HTML spec, but you must implement incredibly forgiving error handling. Just about any crappy, illegal HTML must be processed without throwing an exception--and the browser has to render something from it.


Creating a new language that handles concurrency and pointer bugs in a systematic way that is also fast is amazing. Writing a new browser engine in said language is very ambitious and important.


Presumably to squeeze out maximum performance from multi-core machines. Current engines are all single thread per page instance.


If Rust doesn't allow pointer arithmetic and conjuring up a pointer using '&' like in C, it doesn't make me feel there is anything special here.


If you have a slice of a vector (`&[T]`), then you can also move it forward, like pointer arithmetic in C except checked to ensure that you don't overflow the bounds of the vector. Of course, if you have an unsafe pointer, then you can perform raw pointer arithmetic on it as well.

You can create references using `&`, although I didn't show it in this overview.


Pointer arithmetic should be confined to unsafe/system blocks as it only makes sense when writing low level code like device drivers.

