A Quote by Oren Etzioni

Machine learning is looking for patterns in data. If you start with racist data, you will end up with even more racist models. This is a real problem. - Oren Etzioni
We are going to completely change what it means to do advanced analytics with our data solutions. We have machine-learning stuff that is about really bringing advanced analytics and statistical machine learning into data-science departments everywhere.
If we gather more and more data and establish more and more associations, however, we will not finally find that we know something. We will simply end up having more and more data and larger sets of correlations.
Simple models and a lot of data trump more elaborate models based on less data.
Sometimes if I really want to get someone's attention, I'll start a sentence with something like, "I'm not racist, but..." I say, "I'm not racist, but you look great today." They say, "That wasn't racist at all." I said, "I know. I said I'm not racist. You never listen. Typical Mexican."
People think 'big data' avoids the problem of discrimination because you are dealing with big data sets, but, in fact, big data is being used for more and more precise forms of discrimination - a form of data redlining.
We should always be suspicious when machine-learning systems are described as free from bias if it's been trained on human-generated data. Our biases are built into that training data.
The paradigm shift of the ImageNet thinking is that while a lot of people are paying attention to models, let's pay attention to data. Data will redefine how we think about models.
And my point was one I think that you'd agree with, which is there's no room in America for a black racist, a Latino racist, or a white racist, or an Asian racist, or a Native American racist. Now, we're either color blind or we're not color blind.
People believe the best way to learn from the data is to have a hypothesis and then go check it, but the data is so complex that someone who is working with a data set will not know the most significant things to ask. That's a huge problem.
Everything is changing now that we are in the cloud in terms of sharing our data, understanding our data using new techniques like machine learning.
The bigger a data set that you have, the more polls, the more surveys that you have that people undertake, the more accurate your models are going to be. That's just a fact of data science.
TIA was being used by real users, working on real data - foreign data. Data where privacy is not an issue.
We get more data about people than any other data company gets about people, about anything - and it's not even close. We're looking at what you know, what you don't know, how you learn best. The big difference between us and other big data companies is that we're not ever marketing your data to a third party for any reason.
When you have a large amount of data that is labeled so a computer knows what it means, and you have a large amount of computing power, and you're trying to find patterns in that data, we've found that deep learning is unbeatable.
With too little data, you won't be able to make any conclusions that you trust. With loads of data you will find relationships that aren't real... Big data isn't about bits, it's about talent.
I was interested in data mining, which means analyzing large amounts of data, discovering patterns and trends. At the same time, Larry started downloading the Web, which turns out to be the most interesting data you can possibly mine.