A Quote by Oren Etzioni

I like to say I've been working on big data for so long, it used to be small data when I started working on it. — © Oren Etzioni
TIA was being used by real users, working on real data - foreign data. Data where privacy is not an issue.
Big data has been used by human beings for a long time - just in bricks-and-mortar applications. Insurance and standardized tests are both examples of big data from before the Internet.
People think 'big data' avoids the problem of discrimination because you are dealing with big data sets, but, in fact, big data is being used for more and more precise forms of discrimination - a form of data redlining.
I'm going to say something rather controversial. Big data, as people understand it today, is just a bigger version of small data. Fundamentally, what we're doing with data has not changed; there's just more of it.
Big data is great when you want to verify and quantify small data - as big data is all about seeking a correlation - small data about seeking the causation.
People believe the best way to learn from the data is to have a hypothesis and then go check it, but the data is so complex that someone who is working with a data set will not know the most significant things to ask. That's a huge problem.
AIs are only as good as the data they are trained on. And while many of the tech giants working on AI, like Google and Facebook, have open-sourced some of their algorithms, they hold back most of their data.
While I was there, Voyager flew by Saturn. I got involved with a person who was a member of the imaging team and started working on data from Saturn. With all that data coming in, the imaging team didn't have enough hands or scientists to work on all of it.
This is where the world is going: direct access from anywhere to any type of data, whether it's a small piece of data or a small answer but a long algorithm to create that answer. The user doesn't care about this.
The biggest mistake is an over-reliance on data. Managers will say if there are no data they can take no action. However, data only exist about the past. By the time data become conclusive, it is too late to take actions based on those conclusions.
I'm not from a political family and didn't grow up dreaming of being George Washington. I started working in 8th grade and have held every odd job possible - working in a gravel pit, weighing big wheelers, ticket sales, data base management - but I knew if I worked hard and got experience, I could apply that experience to my next endeavor.
Biases and blind spots exist in big data as much as they do in individual perceptions and experiences. Yet there is a problematic belief that bigger data is always better data and that correlation is as good as causation.
We all say data is the next white oil. [Owning the oil field is not as important as owning the refinery because what will make the big money is in refining the oil. Same goes with data, and making sure you extract the real value out of the data.]
Tape with LTFS has several advantages over the other external storage devices it would typically be compared to. First, tape has been designed from Day 1 to be an offline device and to sit on a shelf. An LTFS-formatted LTO-6 tape can store 2.5 TB of uncompressed data and almost 6 TB with compression. That means many data centers could fit their entire data set into a small FedEx box. With LTFS the sending and receiving data centers no longer need to be running the same application to access the data on the tape.
As individuals, we have very little say about how our data is being used. I'm not worried about the privacy implications of it so much. But it seems to me that, as an individual, if I'm the one generating the data, I should have some kind of say in how it's going to be used.
When I was working in Japan, I created a system for ensuring that intelligence data was globally recoverable in the event of a disaster. I was not aware of the scope of mass surveillance. I came across some legal questions when I was creating it. My superiors pushed back and were like, "Well, how are we going to deal with this data?" And I was like, "I didn't even know it existed."