
They've probably modelled the failure rate for the first one you bought.


"Just avoid holding it in that way"


An expectation that you wipe your sweat off something before putting it away doesn't seem that unreasonable.


It really is my fault, but it is not like they are dripping in sweat. It is just the tiny sheen of sweat left over after running, combined with my desire to immediately put them in the case after using them so I don't lose them. Now I wipe them down and leave them out for a while before putting them back in the case, which makes them much less handy.


Honestly, I would expect them to be at least sweat-proof -- I find that most people use wireless headsets while working out.


Yeah, I really did think they were. The phone and the watch are waterproof. I don't know why the AirPods are not.


I can see the ad campaign two years from now: Swimmer taps watch to play then dives into a backstroke with airpods visible.

Then you try it at your local pool and the airpods immediately fall out the minute you hit the water, and are kicked by a 6 year old into the pool filter.


Is machine learning really to blame for the reproducibility crisis? I'm not in academia, but it seemed to me that the problem was entirely present without machine learning being involved.

For example, Amgen reported that, of the 53 landmark cancer papers they reviewed, 47 could not be replicated [1]. I would have assumed that most of them didn't involve 'machine learning'.

[1] https://www.reuters.com/article/us-science-cancer/in-cancer-...


The problem was there before, but there are reasons why Machine Learning is amplifying bad practices.

In the past people were manually fishing for results in available datasets. Now they have algorithms to do it for them.

In medicine a popular way to use ML is to improve diagnosis. Now there's already a problem in medicine that the benefits of early diagnosis are overrated and the downsides (overtreatment etc.) usually ignored. You get more of that.

And TBH computer scientists aren't exactly at the forefront when it comes to scientific quality standards. (E.g. practically no one is doing preregistration in CS, which in other fields is considered a prime tool to counter bad scientific practices.)


It seems what ML is really doing is exposing weaknesses in our scientific processes. The appropriate response here is to fix the processes, instead of blaming the latest fad and imploring people to "try harder". If the root cause isn't fixed, the next fad after ML will cause the same thing again. What feasible systemic changes can we make so that scientists can't get away with publishing sloppy results?

It's not an interesting question for many scientists, who prefer focusing on technical solutions over political ones.


I think it is a very interesting question for a lot of scientists, but also an extremely hard one to answer. And an even harder one to implement.

Even obvious wins that almost everyone can agree on, like getting publishing out of the hands of for-profit entities that add no value, are taking forever, because cultural, social and political institutions are hard to move.


If we broaden machine learning to include data-fitting tools, I can see how it could produce errors broadly and with no ill intent.

Consider the simple task of fitting a peak to extract a result from some data. You're probably using a commercial tool to identify the peak position, calculate the baseline, and come up with parameters for your model.

But if there's an error (and at least when I was in grad school, the tools would often get stuck in weird local minima that take experience to recognize), it could easily just never be noticed. If your baseline is way off, good luck calculating your peak areas reproducibly...
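
To make that concrete, here's a minimal sketch of the failure mode, assuming SciPy's curve_fit rather than any particular commercial tool: the same Gaussian-plus-baseline model fitted from two different starting guesses can settle on very different answers, and nothing in the output flags the bad one as a local minimum.

  # Sketch only: a Gaussian peak on a linear baseline, fitted with two
  # different initial guesses. The "poor" start may settle far from the
  # true peak (or fail to converge), which is the quiet failure described above.
  import numpy as np
  from scipy.optimize import curve_fit

  def peak(x, amp, centre, width, slope, offset):
      return amp * np.exp(-((x - centre) ** 2) / (2 * width ** 2)) + slope * x + offset

  x = np.linspace(0, 10, 200)
  rng = np.random.default_rng(0)
  data = peak(x, 5, 4, 0.3, 0.2, 1) + rng.normal(scale=0.2, size=x.size)

  good, _ = curve_fit(peak, x, data, p0=[4, 4.5, 0.5, 0, 1], maxfev=20000)
  poor, _ = curve_fit(peak, x, data, p0=[1, 9, 3, 0, 0], maxfev=20000)

  print("centre (reasonable start):", good[1])
  print("centre (poor start):      ", poor[1])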

Data analysis is hard, and it's easy to trust algorithms to be at least more reproducible than doing it manually. On the plus side, if you provide your dataset and code, others can at least redo the analysis! Really excited to see more Jupyter notebooks used for publications in the future.


Medicine may be better than ML but there’s not much in the difference.

> COMPare: Qualitative analysis of researchers’ responses to critical correspondence on a cohort of 58 misreported trials

> Background

> Discrepancies between pre-specified and reported outcomes are an important and prevalent source of bias in clinical trials. COMPare (Centre for Evidence-Based Medicine Outcome Monitoring Project) monitored all trials in five leading journals for correct outcome reporting, submitted correction letters on all misreported trials in real time, and then monitored responses from editors and trialists. From the trialists’ responses, we aimed to answer two related questions. First, what can trialists’ responses to corrections on their own misreported trials tell us about trialists’ knowledge of correct outcome reporting? Second, what can a cohort of responses to a standardised correction letter tell us about how researchers respond to systematic critical post-publication peer review?

> Results

> Trialists frequently expressed views that contradicted the CONSORT (Consolidated Standards of Reporting Trials) guidelines or made inaccurate statements about correct outcome reporting. Common themes were: stating that pre-specification after trial commencement is acceptable; incorrect statements about registries; incorrect statements around the handling of multiple time points; and failure to recognise the need to report changes to pre-specified outcomes in the trial report. We identified additional themes in the approaches taken by researchers when responding to critical correspondence, including the following: ad hominem criticism; arguing that trialists should be trusted, rather than follow guidelines for trial reporting; appealing to the existence of a novel category of outcomes whose results need not necessarily be reported; incorrect statements by researchers about their own paper; and statements undermining transparency infrastructure, such as trial registers.

https://trialsjournal.biomedcentral.com/articles/10.1186/s13...


I am just in the process of digging into this paper and covering it in an article, so I'm quite familiar with it.

But as bad as this is: what the COMPare project is doing here is documenting the flaws of a process to counter bad scientific practice. The reality in most fields (including pretty much all of CS and ML) is that no such process exists at all, because no one even tries to fix these issues.

So you have medicine where people try to fix these issues (and are - admittedly - not very good at it) versus other fields that don't even try.


This is definitely true. A common "blueprint" for articles in applied CS is: first, they propose a "novel" algorithm. This algorithm may be very similar to an existing algorithm and in many cases is identical to one. Then they benchmark the algorithm and show that it performs better on some metrics than existing solutions. These benchmarks are often quite poor, and if you vary them a little, the purported performance increase vanishes.


Are you planning on touching on autoML? Do you think that could help?


It's quite a stretch to say that CS does not have reproducibility! You probably want to define that a bit.


Can you give a citation about ML being used this way and producing overtreatment? I'm aware of experimental results with e.g. Watson that were first overstated and then rejected, but that seemed like a situation where an institution experimented with a poor use of ML and successfully rejected it, not one where they falsely accepted an ML result.


No but it’s making it worse because it’s giving false confidence in results and amplifying failures in experiment design. It’s also been held out as a fix for the reproducibility crisis but just as lots of statistical analysis has been done by people who don’t really understand the math but are just cargo culting other experiments, machine learning is taking that ignorance-of-your-toolset risk to the next level.

Not even the people who write the software understand what patterns are being found. All they can do is point to the results that seem good at a glance. But while we can train software to beat humans within a constrained dataset with careful checking, these techniques cannot find new insights. The software does not understand what the data it's processing represents, and thus it doesn't recognize the abstractions and limits of the data. The patterns it finds are in the low-resolution data, not in the reality that data is a poor copy of. But science needs to analyze the real world, and to do that you must comprehend the errors in your data and what they mean. We are nowhere close to making software that can do that.


  Not even the people who write the software understand what patterns are being found. All they can do is point to the results that seem good at a glance. 
This isn’t really true. For example, we can pass an image to a convolutional net and see which filters are activated; this can give us a clear indication of whether it’s edge detectors that are activating, or textures, or specific shapes (e.g. a dog would activate edge detectors, textures that look like fur, and shapes that resemble a dog's face). We can also train models to disentangle their representations and make specific variables stand for specific things (e.g. for a net trained on handwriting, values in one variable can represent the slant of the writing, another the letter, another the thickness, etc.). There is also a ton of work being done on training causal models. We also have decent ways now of visualizing high-dimensional loss surfaces.
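
As a rough illustration of the filter-inspection point, here's a minimal sketch assuming PyTorch and torchvision (with a random tensor standing in for a real preprocessed image): register a forward hook on an early convolutional layer and see which filters respond most strongly.

  # Sketch only: inspect which first-layer filters activate most strongly.
  # The weights= argument assumes a recent torchvision; the input tensor is a
  # stand-in for a real preprocessed image.
  import torch
  from torchvision import models

  model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
  activations = {}

  def hook(module, inputs, output):
      # Record the mean activation of each of the 64 first-layer filters.
      activations["conv1"] = output.mean(dim=(0, 2, 3))

  model.conv1.register_forward_hook(hook)

  with torch.no_grad():
      model(torch.randn(1, 3, 224, 224))

  top = torch.topk(activations["conv1"], k=5)
  print(top.indices.tolist(), top.values.tolist())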

The field has come a long way since 2012, and the whole “it’s magic, we don’t understand why it works or what it learns” narrative is no longer true.


Some of the points are true, specifically the last part, but you are wrong that the people who write this software don't understand what patterns are being found. We can clearly see in ML and deep models why a decision was made by the hypothesis, using various libraries such as eli5, TensorBoard and others. Deep learning models are in general harder to debug, but it is still possible.

Therefore we know why the hypothesis produces wrong results, but sometimes it is not possible to mend the model due to outliers, rare events, lack of data and/or the randomness that surrounds our world. Just as you point out that statistical analysis is done by people who don’t really understand the math, these false results can be due to scientists using ML without understanding its advantages and limitations.
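
For what it's worth, a basic version of this kind of inspection is easy to do. Here's a minimal sketch using scikit-learn's permutation importance (not eli5 or TensorBoard specifically) to see which inputs a fitted model actually relies on:

  # Sketch only: permutation importance on a toy dataset. Features whose
  # shuffling hurts held-out accuracy the most are the ones the model uses.
  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.inspection import permutation_importance
  from sklearn.model_selection import train_test_split

  X, y = load_breast_cancer(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
  result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

  top = result.importances_mean.argsort()[::-1][:5]
  print(top, result.importances_mean[top])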


The problem with ML is the same one that brought about the reproducibility crisis (RC) — believing that simply exceeding one predefined threshold for some probabilistic metric (like correlation or p value) is 'good enough'.

Of course the problem is compounded if we also fail to propose a causal mechanism and don't try to validate it — something I see data scientists doing all too often, since we seldom employ anything like Design of Experiments practices, and the data we're working with is very rarely created by us.

IMHO, the RC is a reminder to scientists that to confirm a hypothesis you need to pass more than one test, and a reminder to data scientists that we must test using more than one model.
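
To put a number on why clearing a single threshold isn't 'good enough': if you try twenty independent models or metrics at the p < 0.05 level, the chance that at least one clears it purely by luck is already well over half (a back-of-the-envelope calculation, ignoring any correction for multiple comparisons):

  # Family-wise error rate for 20 independent tests at p < 0.05.
  p_single = 0.05
  n_tests = 20
  print(1 - (1 - p_single) ** n_tests)  # ~0.64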


Machine learning trivializes p-hacking. Take a database of random data points. Pick, e.g., 3 input variables at random and map them against one manually chosen output variable. Run the machine learning system and observe the error rate. If it decreases below some value 'p', you now have a [most likely completely spurious] correlation. Spin up an explanation for it - the more sensationalized the better. Claim that the process was done in the reverse order, claim it's science, publish -- you now have a groundbreaking hypothesis that was validated by experimentation.
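
A toy sketch of that procedure, assuming scikit-learn and nothing but noise on both sides: search over random three-column subsets, keep whichever happens to score best, and you have something that looks predictive despite there being no signal at all.

  # Sketch only: "fishing" over pure noise. Run enough searches and some
  # subset will clear whatever threshold you set.
  import numpy as np
  from sklearn.linear_model import LinearRegression

  rng = np.random.default_rng(0)
  X = rng.normal(size=(100, 50))   # 50 columns of pure noise
  y = rng.normal(size=100)         # the target is also pure noise

  best_cols, best_score = None, -np.inf
  for _ in range(2000):
      cols = rng.choice(50, size=3, replace=False)
      score = LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)
      if score > best_score:
          best_cols, best_score = cols, score

  print(best_cols, best_score)  # a spurious "finding", ready to be explained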

This is one of the big reasons that the more modeling, variables, and filtering there are in a study, the more you should discount it. It's too easy to prove something when there's nothing actually there.

An even bigger risk here is that you can engage in the above process and spot check against other data sets to see if it can be validated elsewhere. And you can find correlations that are predictive, yet are in no way whatsoever causal. If we took a sample with enough data on all individuals in the US you'd be able to find some correlation that people who have an E as the second letter in their name, a last name of five characters in length, and went to a high school whose third letter is 'A' have a 23% higher earned income average than those outside the group. And it predicts going forward.

You'd be mapping onto something that obviously has nothing to do with these variables in and of themselves. Perhaps the real issue would be that it's simply a very obscure proxy for a certain group of individuals in a certain subset of educational institutions. But this is only obviously spurious (even if predictive) because these sorts of variables clearly cannot have any causal relationship. When instead you only look at a selection of variables that, in practically any combination, could be made to seem meaningful through some explanation or another, you open the door to completely 'fake' science that provides results and even predictivity, but has absolutely nothing to do with what's being claimed. So people might try to optimize towards the correlations (which are/were predictive), only to find nothing more happens than if people started actively making sure the second letter of their children's names was an E and legally changed their last names to five-letter ones.

---

As a pop culture example of this something similar to this happened with video game reviews. Video game publishers noticed that there was a rather strong correlation with positive game reviews and high sales. So they started working to raise average game scores through any means possible, eventually including 'incentivizing' game reviewers to provide higher scores. As a result game reviews began to mean next to nothing, and the strength of the correlation rapidly faded. Because obviously the correlation was never about high review scores, but about making the sort of games that organically received high review scores. Though in this case we already see "obviousness" fading, since there was some argument to be made that the high review scores were what was driving sales in and of themselves - though that was clearly not the case.


No one uses or needs ML for overfitting 4 variables. You can do that with regular statistics just fine. And how you interpret ML results is just as fraught with error as any statistical argument—just because the technique gives you some result doesn’t mean it’s explanatory, that is science 101.


> No one uses or needs ML for overfitting 4 variables. You can do that with regular statistics just fine.

True, but an ML routine can try a practically unlimited number of models. If I'm using x, y and z to predict w, and I've tried all the linear terms, all the squared terms, all the interactions, all the log terms, and I start throwing in other things, my readers will, rightly, raise an eyebrow. Maybe there's some discontinuity to exploit, but if so, I'll explain it -- a policy change for people aged 65 or older, say, or a market that exists in one state and not an adjacent one.

The ML, by contrast, can invent the most absurdly jagged multivariate functions imaginable, and we typically* don't even know that it's doing so, let alone why.

*as others have written here, we can actually investigate the how (not the why) by inspecting the algorithm -- but the number of papers that do is much smaller than the number that don't.
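
A hedged sketch of that contrast (scikit-learn, pure noise, in-sample fit only; a held-out set would catch this particular case, the point is just how readily a flexible learner carves out a jagged function that 'explains' data with no structure in it):

  # Sketch only: x, y, z are noise and w is unrelated noise. The linear
  # model admits almost no fit; the flexible learner reports a deceptively
  # high in-sample fit.
  import numpy as np
  from sklearn.ensemble import RandomForestRegressor
  from sklearn.linear_model import LinearRegression

  rng = np.random.default_rng(1)
  X = rng.normal(size=(200, 3))   # x, y, z
  w = rng.normal(size=200)        # w has no relationship to X

  linear = LinearRegression().fit(X, w)
  forest = RandomForestRegressor(n_estimators=200, random_state=1).fit(X, w)

  print("linear in-sample R^2:", round(linear.score(X, w), 2))   # near 0
  print("forest in-sample R^2:", round(forest.score(X, w), 2))   # typically high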


The mandatory xkcd significance link https://xkcd.com/882/


City full of tech companies working on facial recognition bans facial recognition from being used on its residents.


The Ocean Conservancy report has been cited by quite a few articles now, and it pains me every time I see it. As ever, the conclusions drawn are quite misleading.

1) The source of the data is beach cleanups, not ocean trash. If anything, it's more likely to be representative of items discarded on beaches or in waterways.

2) Cigarette butts are greatest purely in number of individual pieces, not in terms of weight or volume.

The following paper reports fishing nets to be the largest contributor to ocean plastic (by volume). Cigarette butts aren't even worthy of a mention in their report (see Supplementary Table 4 which lists the top 5 items per item size group).

https://www.nature.com/articles/s41598-018-22939-w


Then there was also the report that 10 Asian rivers account for most of the plastic: https://www.scientificamerican.com/article/stemming-the-plas...

I guess I understand it is a hard problem to calculate, but it is an area where I wish the approach were more reasoned and less governed by fad and hype. Sure, clean up and prevent straws and cigarette butts if they're really a problem, but divvy up our resources in proportion to where the data says the issues are...


I'm probably what you'd call an environmentalist, but the movement is rife with people who are more concerned with getting high off of self-flagellating moral outrage than with fixing problems. Legislative solutions aimed at large polluters and trash generators don't feel like penance in the same way that individual sacrifices do, so a huge chunk of people who are nominally concerned with environmental issues misallocate their complaints wrt the actual severity of polluters.

I'm not sure this is solvable, humans being what they are, and I don't think it's unique to environmental issues.


We probably need (but don’t want) algorithmic control of our actions.


What else is a constitution?


I'm able to complain about plastic and cigarette butts in parallel.


[flagged]


Perhaps tone down your externally-flagellatoriousness if you're in agreement with somebody


What do you mean by "externally flagellating"?


It's possible, and this is just a suggestion in the nicest way possible, that he may be referring to his own face. That or he just made a word up. How intelligent!


I did respond to the following:

"issues misallocate their complaints, wrt the actual severity of polluters"

No need to be sarcastic.


The Asian rivers was something I learned fairly recently. It wasn't a big shock, but it wasn't obvious either.

Do you know if the report also broke down the rivers by size and waterfront population? I.e., do the communities along those Asian rivers pollute more than those in the US or EU, or are they simply the largest rivers in the world by volume and/or local population?

(Yes, I know the Ganges is filthy.)


I imagine that, being in China, the pollution is also an externality of manufacturing for the entire world, which Europe and the US are happy to have occur in China instead of locally.


Yeah, western countries can outsource all manufacturing there, lower their business costs, and then play holier-than-thou on green matters too...


Severely poor communities without any wastewater treatment letting effluent into the river they live next to are the problem. It is the prime recipe for cholera.

So the EU/US usually produce cleaner wastewater, but use a lot more energy (and so pollute via CO2).

Outsourcing manufacturing's externalities is of course something that should be handled by import tariffs and supply chain verification.


Sure, but I was asking specifically about the plastic waste and other trash, which was the original topic. Not general pollutants.


For any kind of pollution, the prime polluter has an incentive to spread factoids pointing to other significant polluters.

The comment you are responding to mentions fishing nets as the biggest pollutant by volume (which is quite logical if one thinks about it). Obviously a marine source has an incentive to highlight land-based sources such as rivers. If one hears about the most polluting rivers, one is easily confused into thinking rivers are the biggest polluter. Also note the questions surrounding the metric of river pollution (i.e. per-capita pollution of rivers and so on...).


God can’t they engineer some kind of filter system at the ends of those rivers?


Cigarette butts really are a problem... at least for me. They are disgusting. So even though they are not a significant source of ocean plastic, they should be avoided or cleaned up.


It's television "news". Why would anyone expect better? The headline tomorrow will say/read, "A new study shows ..." and state the exact opposite.


If we combine that data with data on the annual amount of plastic being thrown into the ocean, we get 4 tons of fishing nets per year being lost by the part of the fishing industry that uses fishing nets.

Buying new fishing nets is quite expensive, and in my experience fishermen tend to spend a large part of the year repairing them. I wonder what the loss rate is per fisherman, national region, and fish caught.


Skimming the article, it estimates there are 42,000 tons of mega (large) plastics, of which 86% would be fishing nets, so 36,120 tons. I didn't find the per-annum data, but that's certainly much more than 4 tons a year (else it would have taken 9,030 years to reach the current weight); I'd guess at least 2,000 tons per year.


The paper you cite is just about the Great Pacific Garbage Patch.


The general problem with this kind of criticism is that it can be leveled at any article citing a statistic.

No matter what value an article chooses to highlight they could have highlighted a different one. True... but so what?

Here I guess the root of your objection is that the article seems to conflate the shoreline with the entire ocean. That's fair: the article is really about shorelines, but in a few places it refers generally to the ocean. It appears to be purposeful, too, since it's in the headline and lede while the rest of the article is straight. Still, an overly strong headline and lede is, unfortunately, an almost universally applicable criticism these days. Headlines and ledes are written for attention-grabbing, not accuracy. That's a problem not of individual articles, but of a system that lives on click-through.

A less superficial criticism would be to examine the degree to which the main point of the article -- that cigarette butt pollution is a problem that should be addressed -- is fair, or whether it's making this out to be a significantly more substantial problem than it really is. (Remember, the context here is activism to reduce pollution, where there was recently a lot of focus and traction on plastic drinking straws, for which there was some backlash, with people arguing that there was a lot of focus on a tiny part of the problem. So a really good question here is: is this just more plastic straw BS, or is there a real issue here?)

And keep in mind that if cigarette butts are a coastal pollution problem it doesn't mean something else, like fishing nets in the ocean, isn't a problem.


>The general problem with this kind of criticism is that it can be leveled at any article citing a statistic.

Only if it cites a badly done statistic or misleads as to the statistic's conclusions, as this article does.

>No matter what value an article chooses to highlight they could have highlighted a different one. True... but so what?

That's not the accusation here. It's that what they chose to highlight is not the most important pollutant of oceans by any reasonable metric. The purpose of citing a statistic in the first place is to consider things in their respective relevance.

Now, restricting this to the shoreline (which the article's editors should have done already in their chosen title), it doesn't seem to be such a problem either (and I'm from a country where most people smoke and which has tons of beaches).


At peak times in Hong Kong's subway, it pains me to see one side of the escalator basically empty. Normally I walk up, so that should make things faster for me, except it's impossible to get to the escalator for the queue of people. If people just used both sides it would be faster - for everyone.


In the Shanghai subway system, the normal case is that one half of the escalator walks and the other stands still. But when the crowd is large, population pressure forces both halves to stand still.

This experience made me surprised that Tokyo would need to implement such a system top-down; I figured their crowds would be large enough that it was happening anyway.

Looks like "better" queueing behavior may be responsible for the failure.


You often need that first person to stand still.

I now take it upon myself to be the “first person” sometimes. If I see a long line forming to stand, and no one is walking at all, I will stand in the “walking” lane.

After that I see a lot of people with relief on their faces when they stand behind me.

I guess it also helps that I’m a bald Asian guy with a long beard, so maybe they think I’m a bum. Otherwise, there were times when I’ve seen people admonish the standers.


One important thing to note.

PCCW, the main local telco, uses Hong Kong ID numbers as passwords by default, or at least they used to. This means that this database contains cleartext usernames and passwords for a significant number of users who have never changed their default passwords.



It cannot

Source: I worked in a related field for five years, and evaluated a lot of technologies from major players.


Future job role: Sesame Credit Optimisation (SCO)

There's going to be so many ways to game this system when it gains wider adoption, it's going to be insane.


Multiple keywords on your resume (S3, AWS, Lambda, Serverless)


If I interviewed someone who explained that as a solution, it would definitely not help their case. The worst hire is someone who overcomplicates projects. It leads to “negative work”.

