arthur_debert's comments | Hacker News

In Brazil's case, at least, it's simpler than that.

Mosquito eradication used to be done at the federal level, with military-like organization. It kept things under control.

A few years ago it was reassigned to municipalities. Not only are local authorities less equipped to handle it, but mosquitoes do not conform to city boundaries. If one city is doing a good job but its neighbor isn't, you're screwed.

There's probably lots to be done through technical advances, but most of this could be avoided using common sense.


It is not a case of doing one or the other, but of doing both.


While you're right that as a judge you must respect the law, we as a society must re-evaluate its usefulness and purpose.

You're looking at this the other way round. It's not that privacy makes it harder to foster a civilized society; it's that removing it makes that totally impossible.

It amuses me that you would see this as a post-Snowden or even an American issue. As a Brazilian I know our state is much more corrupt and inept, and the risks are an order of magnitude higher for us, Snowden or not.

See here for some of the rationale: http://www.thoughtcrime.org/blog/we-should-all-have-somethin...


It's surprising how much of it relates to food delivery (the largest group, in fact).

Wrote my take on how it's distributed here: https://medium.com/@arthurdebert/2abb1df33f6d


It has nothing to do with Java. There's no stack. They simply chose the best JavaScript compiler available.

Pretty simple, really


Sorry. Here's the dissonance: we're building tools (including secure messaging) that are safe from government snooping. We care deeply about the privacy and security of our users. And the web page that tells us that has mixed-content warnings over SSL.

Maybe nitpicking, but security, and OPSEC in particular, requires insane focus and attention to every little detail. If you trip up on the small things, I could hardly trust you to get the hard things right.


To be fair, the script is "http://localhost:35729/livereload.js". That shouldn't leak anything to an adversary, if the request doesn't leave the client's computer.

Still should be fixed though, so the HTTPS warning can serve its function and call out real threats.


You're 100% right. It's just that security is so hard to get right. Only (and maybe not even) the paranoid survive on that front. All it takes is one tiny detail to screw everything up. Leaving development artifacts on your live server is not very reassuring.
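For what it's worth, this particular class of slip-up is easy to catch automatically. A minimal sketch (purely illustrative, using requests and BeautifulSoup; the URL is a placeholder) that fetches an HTTPS page and flags any sub-resources still loaded over plain http://, which is what triggers the mixed-content warning:

    # Hypothetical mixed-content checker: list sub-resources an HTTPS page
    # loads over plain http://, which browsers flag as mixed content.
    import requests
    from bs4 import BeautifulSoup

    def find_mixed_content(url):
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        offenders = []
        # Tags and attributes that pull in sub-resources.
        for tag, attr in (("script", "src"), ("img", "src"),
                          ("link", "href"), ("iframe", "src")):
            for element in soup.find_all(tag):
                target = element.get(attr, "")
                if target.startswith("http://"):
                    offenders.append((tag, target))
        return offenders

    for tag, target in find_mixed_content("https://example.com/"):
        print(f"mixed content: <{tag}> loads {target}")

Run something like this against a staging build before deploying and the livereload tag mentioned above would have shown up immediately.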


Indeed :) And thanks for the heads up, Arthur. Was a bit of debugging code left in by mistake. Fixed it when I was skimming these comments yesterday but haven’t had a chance to reply and say thanks until now :)


Anyone with any experience in the SEO world knows that dchuck doesn't have a fucking clue anymore. He's no different than a talking head reporting on the daily ups and downs of the stock market on your local news.

Ad hominem. If instead of saying "X is an idiot, anyone can tell" you said what's wrong with his argument, you wouldn't look like a talking head.


If you were active in the SEO world, you'd know that what I said wasn't an ad hominem but an accurate description of his abilities.

Danny Sullivan does not actively "do SEO" anymore, a field that changes monthly. He reports on news that other people discover about happenings of the search engine industry.

Which is why he concluded that he didn't know what happened to MetaFilter: he has no fucking clue how modern SEO works anymore.

And it's dchuk.


Thanks, that's a start. And of course, sorry for the misspelling.

In all seriousness, it would be more helpful if you'd mention what's wrong with the original article.

Since you seem to be knowledgeable and active on the SEO front, would you mind explaining what happened to MetaFilter?


The problem is SEO in general.

We have omnipresent and omniscient Google (or so they think) with high priest Matt Cutts bringing the tablets down from the mountaintop.

The SEO crew are on the ground, busy reading chicken entrails and tea leaves and coming up with the best guesses they can, but they really don't know.


Not always; see, e.g., drag-along rights. Depending on how you structure your deal, investors can override founders on acquisition offers.


Good point, but I can't quite square that with the article, so I don't think that's what it has in mind.


Pagination does have its advantages, namely:

- Linking
- Easier to grok where you are in the result set (3 out of 5, instead of a scroll bar that changes height); see the quick sketch below
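To make the "3 out of 5" point concrete, a tiny illustrative sketch (no particular site or framework in mind) of how offset/limit pagination gives you both a shareable link and a position indicator:

    # Illustrative offset/limit pagination: stable, linkable pages plus a
    # "page X of Y" position indicator.
    import math

    def paginate(total_items, page, page_size=20):
        total_pages = max(1, math.ceil(total_items / page_size))
        page = min(max(1, page), total_pages)
        return {
            "offset": (page - 1) * page_size,          # e.g. for SQL LIMIT/OFFSET
            "limit": page_size,
            "label": f"page {page} of {total_pages}",  # what the user sees
            "link": f"/results?page={page}",           # shareable, bookmarkable
        }

    print(paginate(total_items=94, page=3))
    # {'offset': 40, 'limit': 20, 'label': 'page 3 of 5', 'link': '/results?page=3'}

An infinite-scroll view doesn't give you either of those for free.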


Yeah, it works great in search results, much easier to understand for the average user...

Page 1: 1-20 of about 26... cool not too many to scan through

Page 2: 21-40 of about 46... oh

Page 3: 41-60 of about 67... errr

Page 4: 61-80 of about 87... seriously?


It is language, and written language, that enables people to connect and share ideas. There's no reason to credit specific networks like the WWW or the internet with connecting people, because someone else would have done it (possibly better) if they had not.


People credit Facebook with connecting people, when AOL did the same thing 10 years ago and someone else will be providing software for that in 10 years. People credit Stack Overflow with programmers answering each other's questions, while Usenet did that 10 years ago and someone else will be helping with that 10 years from now.

Meanwhile, language has been around for thousands of years and is not quite comparable.


I'm definitely not advocating for people not understanding the problem they want solved.

That said, your post sounds empty. Can you elaborate on why the scrapers you write from scratch make it all better? How do your scrapers deal with encoding detection, broken HTML, content prioritization, and so forth?

I don't like the current options we've got in pythonland, but just writing "this sucks, so I write my own" sounds like an ego trip. Can you describe in detail what BeautifulSoup (or lxml, which is usually a better option) is doing wrong at the lower level and how your scripts are making it better?
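For context on what those libraries already handle, a small illustrative sketch (not from any particular project) of BeautifulSoup dealing with the two things mentioned above, encoding detection and broken markup:

    # Illustrative: what BeautifulSoup already gives you for encoding
    # detection and tolerance of broken HTML.
    from bs4 import BeautifulSoup, UnicodeDammit

    # Raw bytes with no charset declaration: UnicodeDammit sniffs a likely
    # encoding instead of blowing up on the non-UTF-8 byte.
    raw = "<p>caf\u00e9</p>".encode("latin-1")
    dammit = UnicodeDammit(raw)
    print(dammit.original_encoding, dammit.unicode_markup)

    # Broken markup (unclosed tags) still parses into a usable tree.
    soup = BeautifulSoup("<div><p>unclosed <b>bold<p>next para", "html.parser")
    print([p.get_text() for p in soup.find_all("p")])

Content prioritization (boilerplate removal) is the one part neither library really does for you out of the box.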


Sorry if it sounded empty; there is a reason why I didn't include examples. I'm not really saying "don't use libraries", more just that you should understand the problem before looking for an easy solution. To be honest, I've done all my scraping in PHP/Perl over the years. Only recently have I started to look into other options such as Python and NodeJS (hence looking at this thread).

I don't claim that my scrapers are better off because they are written from scratch, but they do the job that I want them to do. If I find a target that has a "quirk", I write that into my classes to be used then and in later instances. The real point of doing it this way is more about knowing what the scraper is doing, rather than what it might do. When you're scraping, you're walking a fine line. Targets may be fine with you doing it to them, but as soon as your scraper freaks out and starts hammering the site, you're in trouble (even worse if you end up doing damage to the target).

I'm not saying that 3rd-party libraries are prone to doing this, more that if you forget to set an option or handle an exception, you might screw yourself. If you wrote the scraper, it's your own fault for not handling the issue properly. If you used a 3rd-party library and the library bugged out causing the issue, you can't really go after the writers, right?

This all comes back to understanding your target, and to understand them, you need some form of knowledge on how it all works.

In response to your questions: I do a lot of things manually when setting up the scrapers. I don't import the data into any sort of DOM (to keep memory in check), so I'm not really concerned about encoding (for the record, I'm generally dealing with UTF-8 and Shift_JIS only) or broken HTML. Instead I do a general check over the source to see if the layout has changed; if it has, the scraper exits gracefully, sends me notifications on what changed, then puts itself out of action until I reset it (if it's a mission-critical scraper, let's just say I have a myriad of alerts sent to me). It's probably not the best way of doing things, but it works for me.
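To give a rough idea of the shape (heavily simplified, not my actual code; the marker strings, URLs, and notify hook are all placeholders):

    # Sketch: throttle requests, sanity-check that the page layout still looks
    # the way the scraper expects, and alert + stop instead of hammering the
    # target when something changes.
    import re
    import time
    import urllib.request

    REQUEST_DELAY = 5.0  # seconds between requests; be polite to the target

    # Markers the scraper expects to find; if they vanish, the layout changed.
    EXPECTED_MARKERS = [r'id="listing"', r'class="price"']

    def notify(message):
        # Placeholder for whatever alerting is in place (email, chat, ...).
        print("ALERT:", message)

    def fetch(url):
        time.sleep(REQUEST_DELAY)
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")

    def scrape(urls):
        for url in urls:
            html = fetch(url)
            missing = [m for m in EXPECTED_MARKERS if not re.search(m, html)]
            if missing:
                notify(f"layout changed on {url}, missing {missing}; stopping")
                return  # out of action until a human resets it
            # ...extract the fields you care about from `html` here...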

Sorry if I was vague; I probably should have put some sort of rant detection on my mouth. If I didn't answer something specifically, it's not that I was ignoring it; it probably just fell into the "I don't trust it so I don't use it" category. Again, I'm not advocating that people shouldn't use 3rd-party libraries, just that you should at least know what you're doing before you do.


