
... you get arrested for having a "very high score".

Considering this is PRECISELY how spam filtering works, it doesn't seem entirely irrational.

Much like spam filtering, it would all come down to dialing in your filters and picking a good threshold.

Hell, if the system was good enough, it could actually improve freedoms. We currently arrest many innocent people as part of the legal process (who are later exonerated). What if our "arrest filter" outperformed the current system, in terms of percentage of innocent people arrested? It doesn't have to be perfect to be better.
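
To make "dialing in your filters" concrete, here is a rough sketch of the threshold trade-off, spam-filter style. Everything below, from the toy data to the score distributions, is invented purely for illustration:

    import random

    random.seed(0)
    # Toy population: 1% of people are "guilty"; the hypothetical filter
    # gives guilty people higher scores on average, with plenty of overlap.
    labels = [1 if random.random() < 0.01 else 0 for _ in range(100_000)]
    scores = [min(1.0, max(0.0, random.gauss(0.7 if y else 0.3, 0.15)))
              for y in labels]

    def rates_at(threshold):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_pos = sum(flagged)
        precision = true_pos / len(flagged) if flagged else 1.0
        recall = true_pos / sum(labels)
        return precision, recall

    # Raising the threshold trades missed "bad guys" (lower recall)
    # for fewer wrongly flagged innocents (higher precision).
    for t in (0.5, 0.7, 0.9):
        p, r = rates_at(t)
        print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")

The uncomfortable part is the same as with spam: any threshold low enough to catch most of the guilty also flags a pile of innocents along with them.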



The problem with such a filter is that it will be perceived as inefficient and broken by those evaluating it if it denies an officer the authority to arrest someone he has already detained or determined to be worthy of arrest.

It will be another system that grants permission at the whims of the department, one that absolves individual officers of blame, punishment, and consequences for the bias and abuse they'll continue to revel in.


There is no reason such a filter need be an authoritative source. Arrest someone it didn't suggest, if you like. Choose not to arrest someone it did suggest. Human discretion would still be applied. But if it proved accurate, officers would trust it, and the population would get upset if it identified a criminal an officer did not arrest.

No different than a virus or spam scanner, really. I trust my scanners. Sometimes they are wrong, and I know they are, and I bypass them. But I know they are right most of the time.

You seem to have the notion, though, that officers do not arrest people to "catch the bad guy". It sounds like you are saying you believe most officers do not actually care whether the arrestee is guilty (and will thus always ignore the filter), and merely arrest for kicks/pleasure/vengeance? I do not believe the majority are like that.


At least in the US, officers have arrest quotas to fill, which are most easily met with petty crime that happens all the time and is easy to find. Attorneys get promoted based on successful cases, meaning those where they could convince a jury that the person is guilty; any foresight that goes beyond what they can present to a jury of laymen is lost time and will get them labeled ineffective.


>You seem to have the notion, though, that officers do not arrest people to "catch the bad guy". It sounds like you are saying you believe most officers do not actually care whether the arrestee is guilty (and will thus always ignore the filter), and merely arrest for kicks/pleasure/vengeance? I do not believe the majority are like that.

You seem to have the notion that I believe what you think I believe. No, I do not believe that officers are arresting people for kicks. I do believe that human bias, power, career building/justification and 'gut feelings' about who might be a bad guy perpetuate unequal application of the law and abuse in a policed society.

I do believe that if a filter sought to mitigate those very real problems, it would be fought by the system it is meant to augment. And I am making the very forgiving assumption that those motivations and skews in perception aren't unknowingly built into the filter by its human creators, and that biased information isn't fed into it. If either of those things did happen, the filter would be welcomed with open arms by police departments nationwide.

I do believe that such a filter will be perceived as broken or inefficient if it doesn't confirm the officers' preconceived notions about who is a criminal or worthy of arrest.

Unfortunately, I do not believe that the problem is 'the bad guys are getting away, and we need a system to find them'. It's rather the opposite, 'innocent people are having their lives interrupted and sometimes ruined because officers think they look suspicious for reasons and need to justify their paychecks; we need a system to mitigate that'.

I do not think a system that blinks a little arrest light if a suspect is Muslim and uses Google has any place in society, no matter how many times you cry 'b-but machine learning!! Spam filters!! Bayes!!'

Your filter would cement this period's problems into what the public sees as an infallible machine's instructions, and would enable abuse without accountability, because you can always point to the machine and say you followed your best judgment.


Spam filtering works using actual data that someone once used to train it. If we take this analogy at face value, it means the NSA is not just targeting people who might be terrorists and collecting data on others as a side effect, but actively targeting people with no ties to terrorism whatsoever, for reasons that both the US and the international public might find unsavory if they found out about them.

Spam filtering also works in a very different context. The spam-to-nonspam ratio is something like 90% spam to 10% nonspam, which means there is lots and lots of spam to filter out, and if an important email slips through, people are bound to notice and either adjust their spam filter or do something about it. In the other setting, 99.99% or more of the people have nothing to do with terrorism or criminal activity, and maybe one or two dozen (among tens of millions) are the ones you are actually targeting. First, erroneously targeting a substantial chunk of your uninteresting population ties up resources: you're spending your time investigating people who are not terrorists. But since the job is difficult anyway, at least you seem like you're doing something with all the money you receive, and never mind if some of the data is used for industrial espionage or for hunting people that only poultry farmers and fracking magnates would call terrorists. And if you miss one of the two dozen real targets, well, they won't do anything harmful this year or the next, because they also have to fear regular law enforcement, and when they do act it will probably come at a moment that suits your next funding request.
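
To put rough numbers on that base-rate gap (all figures invented for illustration; this is the standard false-positive-paradox arithmetic, not actual NSA data):

    # Back-of-the-envelope base-rate arithmetic with made-up numbers.
    population = 50_000_000       # people under surveillance (hypothetical)
    real_targets = 24             # "one or two dozen" genuine targets
    true_positive_rate = 0.99     # filter catches 99% of real targets (generous)
    false_positive_rate = 0.001   # flags only 0.1% of everyone else (also generous)

    flagged_real = real_targets * true_positive_rate
    flagged_innocent = (population - real_targets) * false_positive_rate

    precision = flagged_real / (flagged_real + flagged_innocent)
    print(f"people flagged: {flagged_real + flagged_innocent:,.0f}")
    print(f"chance a flagged person is a real target: {precision:.3%}")
    # ~50,000 people flagged, of whom ~0.05% are real targets:
    # roughly 2,000 innocent investigations per actual terrorist.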

tl;dr: Because we don't have a large sample of actual terrorists on hand, it's hard to evaluate activities like the NSA's. Evaluating them would nonetheless be desirable, since we're handing them large chunks of money that could be put to better use making everyone safer by fighting actual crime rather than some fuzzy notion of terrorism.


> "it doesn't seem entirely irrational." (Yes it does...)

> "It doesn't have to be perfect to be better." (Yes it does...)

The problem used to be approached by presuming innocence (demanding perfection), rather than with a willingness to accept false positives (20 years ago, spam filters weren't available as an analogy...). It is always possible to wrongfully judge someone, but that was never a valid or acceptable outcome ("It is better that ten guilty persons escape than that one innocent suffer" - Blackstone). We accept that spam filters give false positives (not to mention that one person's spam is another person's opportunity), so I think comparing the justice system to detecting spam is a mistake, and moreover that the goal of "prevention" itself is a red herring.

The goal of prevention encourages us to accept lower thresholds of guilt probability, and that is wrong. In other words, if prevention is an end, then it is worth deliberately (rather than accidentally) restricting innocent people on the basis of virtually any nonzero probability of guilt. 80% "guilty" by association (for using Tor, for example), 45%, etc., would all be enough to justify legal action, and the thresholds would certainly depend on whoever is in power and has access to the database that week. This is a very different model from presuming innocence, which aims not only for zero false positives but also for redress when the justice system errs.
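
Blackstone's ratio can even be read as an explicit decision threshold. A sketch of that standard expected-cost reading (the 10:1 weighting is his; the framing and numbers are mine, purely illustrative):

    # "Better that ten guilty persons escape than that one innocent suffer"
    # treated as a cost ratio in a decision rule (illustrative, not legal doctrine).
    cost_false_positive = 10.0   # harm of punishing an innocent person
    cost_false_negative = 1.0    # harm of letting a guilty person escape

    # Act only when the expected harm of acting is below that of not acting:
    #   (1 - p) * cost_fp < p * cost_fn   =>   p > cost_fp / (cost_fp + cost_fn)
    threshold = cost_false_positive / (cost_false_positive + cost_false_negative)
    print(f"required P(guilty): {threshold:.3f}")   # ~0.909, near-certainty

Accepting an 80% or 45% "guilt score" is not a neutral tuning decision; it silently rewrites that cost ratio, i.e. how many innocents we are willing to restrict per guilty person caught.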

I think today we are mostly talking around the fact that a crime has to have been committed in order to deserve punishment, and that, for that reason, prevention cannot be a valid goal in itself (though it's nice when it happens).

Rationalizing surveillance as a tool to "prevent" rather than to justly punish wrongdoers (which centralized surveillance does not do, because it is centrally operated and thus conflicted; everyone owning a camcorder is another matter...) implies, IMHO, that the central database needs to go (and that individuals need to be empowered instead).


Hold on there, friend. I was not suggesting we replace the judicial system with a filter. Rather, arrests.

I.e., make the arrest based on the filter, then run the trial with the same old jury of your peers.

Convictions should be false-positive-free. But our system would not work if arrests also needed to be 100% false-positive-free.

I'm also not advocating punishment for crimes that have not yet been committed. Rather, think of it as looking for flags for crimes that have already been committed or are in progress. For example, there are all sorts of small flags thrown by embezzlement or salami-slicing that, put together, identify the operation.
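
For the salami-slicing example, the kind of flag aggregation I have in mind might look something like this (a hypothetical sketch; real financial-crime detection is far more involved):

    from collections import defaultdict

    # Hypothetical weak-signal aggregation for salami-slicing: transfers that
    # are individually innocuous but suspicious in aggregate.
    def salami_flags(transactions, amount_cap=1.00, min_count=100):
        """transactions: iterable of (payee, amount) pairs."""
        tiny = defaultdict(int)
        for payee, amount in transactions:
            if 0 < amount <= amount_cap:   # flag 1: sub-dollar amounts
                tiny[payee] += 1
        # flag 2: heavy repetition toward a single payee
        return [(payee, n) for payee, n in tiny.items() if n >= min_count]

    # 5,000 transfers of $0.37 to one account raise a flag
    # that no single transfer would on its own.
    txns = [("acct-123", 0.37)] * 5000 + [("acct-999", 250.00)]
    print(salami_flags(txns))   # => [('acct-123', 5000)]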


> make the arrest based on the filter, then run the trial in the same old jury-of-your-peers.

LOL, jury of peers. You mean the jury that is left after the prosecutors and defenders screen out the most competent jurors? The same jurors who typically believe you are guilty because you've been arrested? Have you been in a typical criminal courtroom lately? Any public defender will tell you that going to trial in cuffs and jailhouse orange will almost certainly get you a conviction.

There are lots of things that need to be fixed in the justice system. Let's not give them more tools to make it worse.


Granted, arrests are held to a different standard than convictions: they merely require "probable cause" rather than proof of guilt, and this lower standard does make it look like the spam-filtering analogy may fit. But in calculating this new "guilt probability", our spam filter relies increasingly on the "testimony" and "facts" presented by the surveillance database itself, and it is the objectivity of this database in practice, or rather of the ones accessing it, that I am directly calling into question (though I didn't elaborate above).

Unfortunately, the database cannot be trusted, by virtue of its centralized nature and administration (even if that centralization is justifiable, for example to protect everyone's privacy). The hardware may be objective, but people are not: people lie, cheat, and steal when they can get away with it, and there are simply too few separate and competing interests to hold the small number of people with access to the database and tools accountable, and thereby ensure their objective application. We have seen centralized data collected and used for private interests in the past (and books censored, and guns regulated, and...), whether by fascist governments or through police protectionism (lying under oath, evidence tampering, racial "profiling"), economic fraud, etc. It is human nature to use one's control to one's advantage, and it is simply too tempting for police to shoot first (detain, seize, etc.), especially when it is in their interest, and ask questions later (check the database for cause; use "parallel construction"; take incriminating speech out of context).

It would be worse if that extended all the way to conviction, but it presents the same kind of problem for arrests, detentions, searches, etc., since it is effectively the word of the administrators (whom we trust not to abuse the data and tools) against the person arrested. The more centralized the data and tools become, the less we can trust them to be applied objectively without accountability.

Unfortunately, there are no checks and balances on absolute power (centralization), and so we cannot allow centralization to continue indefinitely. Absolute power corrupts absolutely, and it is my "thesis" that arrests are not a suitable application of these tools. The risk is too great. Police already have a high level of responsibility (the authority, training, and tools/weapons to exert control by force) and what feels like decreasing accountability (because the kids, because the drugs, because I said so, because I can, because of cronyism, and because wealthy people don't like hearing criticism), and since they are nonetheless "only human", I don't recommend giving them more.

Granted, you are merely describing a potentially objective algorithm, but my point is that the objectivity of any given tool is moot given the human element. Guns don't kill people, people do, and they will continue to do so even with checks and balances (like laws against murder; if prevention were the goal, we'd be failing daily). It is only the distribution of accountability (peer juries, private key sharing, democratic voting, citizen groups, etc.) that keeps such roles in check.

Anyways, thanks for the opportunity to flesh my thoughts out more.


I guess my theory partly depends on the filter being too sophisticated for any one person to co-opt. We can design machine learning, but there can't be many people capable of wrapping their heads around a running machine-learning system, reaching in right there, and peeking/poking some weight so that, bam, your nephew is arrested in Texas. On the bright side, most of those people are probably not officers, whom you seem to be most afraid of.

As for the objectivity of feeding the filter data, I envision something completely automatic. No selective entry for this or that suspicious person: the filter is fed a database of all people, and perhaps monitors the internet's traffic on its own. Maybe ACH traffic too. Financial crime could be this system's biggest win: computers are far better suited than humans to uncovering financial crime.

Basically, when it's big enough and sophisticated enough and automated enough that no one person can fully understand it, it becomes significantly harder to pervert. And, as I mentioned before, it needn't be perfect: our current system is pervertable too (see: papers please, racial profiling, etc.), so this one would just need to be less pervertable...


Then why does the system not look out for corrupt politicians or black military budgets? Because the filters are not tuned well enough yet, as if that will ever be an objective? I'd say it's because it's not a spam filter that filters for spam... more like a spam filter that filters out the competition's spam, lets yours through, and kills emails warning about this. Call me paranoid, but until the big guns are primarily used to catch the big villains, this is what I see.


The 'arrest filter' is only as good as the inputs. As it currently stands, I'm sure that things like "uses drugs recreationally," "is black," "is Muslim," and "is not Christian" would end up counting towards your 'arrest score.' And because your arrest score is computed by a machine rather than a human, that will be used as an excuse to call it unimpeachable. E.g., "Machines can't be racist, so the arrest score going after lots of poor, Black men must mean that there's something to it."


Only as good as the inputs, yes. But if it's a halfway decent filter, it will include machine learning, e.g. a Bayesian filter, and if "is Muslim" turns out to have low correspondence with actual criminal activity, that input will quickly be deweighted. Or perhaps paired with other aspects: e.g., perhaps "is Muslim" is of no consequence and "Googles Jihad" is also of no consequence, but "is Muslim" && "Googles Jihad" gives you a point. That's just one example of the patterns a good filter could recognize.
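
A sketch of how that deweighting and pairing could fall out of a Bayesian-style filter (all the counts below are invented):

    import math

    # Invented occurrence rates of each feature among "criminal" vs "innocent"
    # records. A Bayesian filter weights a feature by how much more common it
    # is in one class than the other (the log-likelihood ratio).
    rates = {
        "is_muslim":                   (0.010, 0.010),    # equal in both classes
        "googles_jihad":               (0.020, 0.020),    # ditto
        "is_muslim AND googles_jihad": (0.008, 0.0001),   # only the pair carries signal
    }

    for feature, (p_criminal, p_innocent) in rates.items():
        weight = math.log(p_criminal / p_innocent)
        print(f"{feature}: weight {weight:+.2f}")

    # is_muslim and googles_jihad each come out at +0.00 and are effectively
    # deweighted to nothing; the conjunction scores +4.38.

Note that a strictly naive Bayes filter wouldn't learn the conjunction on its own (it assumes features are independent), which is why the pairing has to be fed in as a feature of its own, exactly as described above.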

> Machines can't be racist, so the arrest score going after lots of poor, Black men must mean that there's something to it.

If a learning Bayesian filter targets a certain demographic, there probably IS something to it.

That really would be amusing/pleasing, if all this work we've spent developing spam filters became the lead-up to an accurate, learning crime filter. Perhaps the fork of spamassassin will be known as crimeassassin?


> If a learning Bayesian filter targets a certain demographic, there probably IS something to it.

I'm pretty sure both Bayes and Laplace would not agree with the categorization of Bayesian probability as some sort of panacea for determining truth in criminal matters.


Yes, it is just probability and not truth. But an arrest is often a "guess" made with some degree of confidence, not absolute certainty. Is a "weight" or "probability" thus not perfectly applicable?

Trials and convictions are a different matter, more suited to truth-seeking.
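
In the Bayes' rule sense, evidence shifts the odds but never hands you certainty. A small worked update with invented numbers:

    # Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
    prior = 1 / 10_000         # 1 in 10,000 people committed this crime
    likelihood_ratio = 1_000   # the evidence is 1000x likelier if guilty

    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    posterior = posterior_odds / (1 + posterior_odds)
    print(f"P(guilty | evidence) = {posterior:.1%}")   # ~9.1%

    # Strong evidence about a rare event still leaves most flagged people
    # innocent: a degree of confidence, not a finding of truth.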


Don't disregard the effects that 'just an arrest' has on people. For example, being arrested on child-porn-related charges but not charged/convicted isn't exactly a no-op.


Not for determining truth, but for the degree of confidence in it, yes. Probability is the tool for getting to the truth, if you allow for your information being incomplete and uncertain.


Perhaps you should watch "The Thin Blue Line" a few hundred times... and then repost...

I would be seriously concerned about a Bayesian filter being applied as the sole reason for arrests...



