Oracle's CEO Presents at Big Data Primer - Financial Analyst Webcast (Transcript)

Oracle Corporation (ORCL) Big Data Primer - Financial Analyst Webcast Conference Call October 8, 2013 12:00 PM ET

Executives

Shauna O'Boyle - Senior Manager of Investor Relations

Andy Mendelsohn - Senior Vice President, Database Server Technologies

Analysts

Brendan Barnicle - Pacific Crest

Operator

Good day, ladies and gentlemen, and welcome to the Oracle Big Data Primer Webinar Call. At this time, all participants are in a listen-only mode. Later, we will conduct a question-and-answer session and instructions will follow at that time. [Operator instructions] As a reminder, this conference call is being recorded.

I would now like to introduce your host for today's conference, Ms. Shauna O'Boyle. Ms. Shauna, you may begin your conference.

Shauna O'Boyle

Thanks. Hello everyone, and thank you for joining us today as part of our ongoing educational speakers' series hosted by Oracle. I’m Shauna O'Boyle, Senior Manager of Investor Relations, and today is Tuesday, October 8, 2013. Joining us today is Oracle’s Senior Vice President of Database Server Technologies, Andy Mendelsohn; and Equity Research Analyst, Brendan Barnicle of Pacific Crest.

Today, Andy will be discussing Big Data. However, he will not be discussing any data that is not already publicly available. At the conclusion of Andy’s presentation, we will turn the webcast over to Brendan who will moderate the question-and-answer portion of the call.

However, you may submit questions at any time during the presentation by typing your question in the Q&A box in the lower part of your screen. Please keep in mind that we will not comment on business in the current quarter.

As a reminder, the matters we’ll be discussing today may include forward-looking statements and as such are subject to the risks and uncertainties that we discuss in detail in our documents filed with the SEC, specifically the most recent reports on Form 10-K and 10-Q, which identify important risk factors that may cause actual results to differ from those contained in forward-looking statements.

You are cautioned not to place undue reliance on these forward-looking statements, which reflect our opinion only as of the date of this presentation. Please keep in mind that we are not obligating ourselves to revise, update, or publicly release the results of any revisions of these forward-looking statements in light of new information or current events. Lastly, unauthorized recording of this conference call is not permitted.

I would now like to introduce Andy Mendelsohn.

Andy Mendelsohn

Thanks Shauna. Good morning, everybody. So I’m going to give a very short, about 15-minute chat about Big Data and then we’ll take questions.

I thought I’d first start by talking about what Big Data is. There’ve been all kinds of people talking about what Big Data is. There are statements about the 3Vs or 4Vs of Big Data, and all the various new start-up companies that have anything to do with information are claiming they are Big Data companies.

So what is the real kernel of what’s going on here? Well, the real kernel is that -- what we’re talking about here is it’s all about analytics. People have been doing analytics for 30, 40 years with information systems, and what we are talking about here is moving to the next generation of those analytics, which we’re calling Big Data analytics.

And there are two key transformations going on in the industry that are driving the Big Data trend, and number one is really about different kinds of data. So, traditionally, analytic systems have been looking at data from a company’s operational systems; for example, if you’re a big retailer, you’re looking at data from your retail sales, the sales information for products at your retail stores. If you’re a telco, you’re looking at your call data records, which record all the phone calls, how long they lasted, what the charges are, and things of that sort.

So that’s analytics on your operational data, and what people are talking about with Big Data is looking at more kinds of data than they’ve traditionally been looking at. And some of these data sources are from inside the company: things like documents, more unstructured data, voice, video, and, as we move to the Internet of Things, people are looking at sensor data sources as well.

And on the Internet, there is of course also a huge amount of data, and companies are asking: is there some way I can extract useful information out of social media? Can I find information about my customers, and what they are saying on social media about my company? Are they saying something good? Are they saying something bad? There are a lot of bloggers out there saying all kinds of things that are potentially of interest to companies.

So people are very excited about the possibility of looking at this data, getting more information, especially about their customers, and using that to raise the potential revenue of their companies by better marketing to their customers, up-selling them, etcetera. So I think that’s the first big thing: moving from operational data sources to these broader internal and Internet sources.

The next big thing going on is that new kinds of analytics are being done. Traditionally, people did what they call OLAP analytics, just slice and dice of data, mostly historical data from the operational systems, and we’ve been moving over the last few years to more predictive analytics, data mining techniques for example, to look at what’s going to happen in the future. And as we move to these broader kinds of data sources, people are looking at doing analytics against those kinds of data sources: text, spatial.

Graph analytics has become very popular as you look at social networks; people want to run queries about who is a friend of whom and, based on that, do some kind of interesting marketing or targeted advertising, etcetera. We’ve also been moving to new kinds of analytic tools. R has become very popular as a development tool for doing analytics processing, and of course we’ve been moving in-memory in a lot of places to do in-memory analytics, in-memory databases, etcetera.
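The "who is a friend of whom" query mentioned above can be sketched in a few lines of Python. This is a minimal two-hop friend-of-friend lookup, not any Oracle product code; the friendship graph and names are made up purely for illustration.

```python
def friends_of_friends(graph, person):
    """Return people exactly two hops from `person` in a friendship graph.

    `graph` maps each person to the set of their direct friends.
    """
    direct = graph.get(person, set())
    result = set()
    for friend in direct:
        result |= graph.get(friend, set())
    # Exclude the person themselves and their direct friends.
    return result - direct - {person}

# Hypothetical social network for illustration.
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "eve"},
    "dan": {"bob"},
    "eve": {"cat"},
}
print(friends_of_friends(graph, "ann"))  # {'dan', 'eve'}
```

A real graph engine would run this kind of traversal in parallel over billions of edges, but the set algebra above is the core of the "friend of a friend" queries used for targeted advertising.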

So, I think those are the two big key drivers of what’s going on in Big Data. And so, what are we doing at Oracle to deal with this new world? Well, Oracle of course has for many years been the market leader in analytics, and we have the world’s leading data warehousing technology and BI technology, with our database and our BI products as well.

And so, what we’re doing as we move forward into the Big Data space is of course building a platform and a set of solutions that deal with Big Data. We need to be able to acquire, organize, discover, and analyze Big Data. And as we move forward, in addition to using the Oracle massively parallel relational database to do analytics against Big Data, we’re also now moving to utilize Hadoop as a platform in our Big Data environment. We’ve also come out with Oracle’s NoSQL database, which is of value in the ingest phase of Big Data, and of course, we’re continuing to evolve our Big Data tools as well.

One of the big things we’re doing at Oracle that I think is very important in the Big Data space is that we are in the business of delivering what we call engineered systems. These are combinations of hardware and software that are integrated together to deliver a great off-the-shelf experience for customers, where they can order our hardware and software and get great time to value, where they can very quickly build systems.

And we’ve done that in the Big Data space. We have our Big Data Appliance for running Hadoop processing and our Oracle NoSQL database as well. We have Oracle Exadata, a great platform for doing massively parallel analytics, and then of course we’ve added Exalytics, which is our platform for our BI tools and our Endeca data discovery tool.

And finally, on top, Oracle is of course in the applications business, so we are also in the business of delivering horizontal and vertical applications and solutions to our customers, and we’ll continue doing that on top of this stack of Big Data technologies as well. So that’s one of our key strategies, and now let’s move a little closer to the products that we’re actually delivering.

So in this picture, we’re showing, of course, on the left side, the data sources we talked about: all kinds of data sources, both operational and new kinds of data sources like documents and blogs and information off the Internet, etcetera. And what we’re showing that is a little different now is that, in the past, you would take this information and you might put it into various files and do staging operations, ETL operations, to transform the data before moving it into a data warehouse.

Now what we’re showing in this next-generation platform is using Hadoop and Hadoop’s HDFS distributed file system as a way of ingesting large amounts of this data, and then we use the Hadoop MapReduce platform to do some batch processing and ETL transformation against that data, sift through the data, look for interesting tidbits, before we then use our Big Data connectors to load it into the Oracle Exadata data warehouse.
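The ingest-then-transform flow described here can be sketched in miniature: a map step that parses raw records and discards junk, and a reduce step that aggregates by key, the way a MapReduce ETL job would before the results are loaded into the warehouse. The comma-separated record format and store names below are invented for illustration.

```python
from collections import defaultdict

def map_phase(raw_lines):
    """Map: parse each raw line into (key, value) pairs, skipping junk."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # discard malformed records during ETL
        store, amount = parts
        yield store, float(amount)

def reduce_phase(pairs):
    """Reduce: aggregate values by key, as the shuffle/reduce step would."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical raw sales feed, including one malformed line.
raw = ["store1,10.0", "store2,5.5", "garbage", "store1,2.5"]
print(reduce_phase(map_phase(raw)))  # {'store1': 12.5, 'store2': 5.5}
```

In a real Hadoop job the map and reduce functions run in parallel across the cluster over HDFS blocks, but the shape of the computation is the same.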

We also show Oracle’s NoSQL database here as well. NoSQL databases are also very good at ingesting information rapidly and, again, doing some operations against it, and then that data can also be moved into a data warehouse.

On top of this basic platform of Hadoop and the Oracle Database, we have our analytics engines. We have, in the database, our advanced analytics capability, which includes predictive analytics and our R processing. We’re also making these kinds of analytics available on the Hadoop platform as well.

And then finally, on top, we show Oracle’s business analytic tools, like our OBIEE tool and our Endeca data discovery tool. Those tools can be run against data both in the Hadoop HDFS file system and in the Oracle Database to deliver visualizations and analytics against the data.

Okay, let’s go to next slide. And as I mentioned, a big part of our strategy is our Engineered Systems, so in this next slide I just sort of show how those Engineered Systems fit into the Big Data Solution we just talked about.

Of course, the Big Data Appliance is our engineered system for running both Hadoop and the Oracle NoSQL Database. We are the first vendor, by the way, that’s producing an engineered system optimized for running Oracle NoSQL, or any other NoSQL database for that matter.

Exadata of course is our platform of choice for running big, massively parallel data warehousing and doing your interactive analytics, and Oracle Exalytics is our engineered system for running our BI analytics foundation, which includes OBIEE, the Endeca data discovery product, and also our Essbase engine as well.

Okay. Let’s move on. What I want to do at this point is drill down a little bit on how Hadoop and Oracle databases relate to each other, because I think there’s a lot of confusion out there about what Hadoop is good for and what a massively parallel relational database is good for. And the key thing to understand, and what we’re doing in this platform, is that in order to have a Big Data solution, you need both.

If you talk to even the people who were the biggest early advocates of Hadoop, who are now trying to do analytics, what they’ve decided is that Hadoop is a great platform for ingesting large amounts of data at very low cost per terabyte, and it’s a great platform for doing some analytics on that data, but it’s more of a batch-processing platform for analytics. So what does that mean?

Well, it means that if you have a data scientist sitting in front of his terminal asking questions of the data, trying to understand the business and trying to come up with great ideas for raising revenue, better marketing, advertising, etcetera, he wants interactive response. He wants to send in a query and get a response back in a few seconds.

That’s not really what Hadoop was designed for. Hadoop is designed to crank out scalable batch-processing execution, and you’ll get maybe tens of minutes or an hour of response time for those kinds of queries. What they want is snappy response. That’s why you need a massively parallel relational database, and that’s what these guys are doing. They’ll use Hadoop for ingestion and for doing some big batch-processing analytics against Big Data.

But then they’ll move a subset of the data that they want to do further analytics on into their massively parallel relational database, in this case Exadata, and that’s where they’ll do their interactive analytics against the data, using of course the rich SQL language that we provide with Oracle.

The other thing to note is that, although there are some SQL tools available in the Hadoop environment, they are very primitive and raw: Hive, Pig, etcetera. And then of course, you can code in Java as well. But on the massively parallel relational database side of the world, what people do is code in SQL for the most part. SQL is a very expressive and productive language; a couple of lines of SQL is equal to hundreds of lines of code in Java.

It’s also much more efficient in processing, so it’s much faster and requires far fewer computing resources to get a given job done. So people also like the fact that relational databases are just much more productive environments than Hadoop is today.

We also mentioned R. R is of course something that we’ve made available as a statistical programming and predictive analytics language on the Exadata platform, and we’re now also making it available on the Hadoop platform as well.

Let’s go to the next slide. Here I just wanted to give -- this is, I guess, a slide taken out of my keynote from [inaudible], so you’re all welcome to go to oracle.com and take a look at the keynote I did there on Oracle Database 12c. But what this slide shows is the end result of an example we went through, where we said, let’s say you want to do a very common example that people talk about in Big Data, which is looking for fraud in a banking system of some sort, and we wrote the application two ways. We wrote it using Java on Hadoop using MapReduce.

We also wrote it using a SQL extension we call SQL pattern matching, which is a new part of the SQL language that we’ve implemented in the 12c version of our database. And we measured two key metrics here. One is how many lines of code it takes to solve this problem, and what you see here is that it took over 650 lines of code using Java MapReduce versus, I think, on the order of 15 lines of code using SQL. So number one, SQL is much, much more productive than having to code at a much more primitive level, in this case using Java MapReduce.
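To give a flavor of what a fraud-detection pattern looks like when coded imperatively, here is a toy Python version of one such rule: a run of several small "probe" charges followed immediately by one large charge. The transcript doesn't describe the actual pattern Oracle used, so the rule, thresholds, and data below are invented; the 12c SQL pattern matching feature lets a rule like this be stated declaratively in a handful of lines instead.

```python
def flag_probe_then_drain(amounts, small=100.0, big=10_000.0, min_small=3):
    """Flag indices of large transactions that follow a run of >= min_small
    small transactions on the same account (a hypothetical fraud signature)."""
    flagged = []
    run = 0  # length of the current run of small transactions
    for i, amount in enumerate(amounts):
        if amount < small:
            run += 1
        else:
            if amount > big and run >= min_small:
                flagged.append(i)  # the large charge ending the pattern
            run = 0  # any non-small transaction resets the run
    return flagged

# Hypothetical card-transaction amounts for one account, in time order.
amounts = [12.0, 3.5, 9.0, 15000.0, 50.0, 20000.0]
print(flag_probe_then_drain(amounts))  # [3]
```

Even this single rule takes a dozen lines of state-machine bookkeeping in imperative code, which is exactly the productivity gap the 650-versus-15 lines comparison illustrates.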

And we also wanted to show that we can run SQL in a very high-performance, massively parallel fashion. In this example, the run time of the SQL version of the analytics was also much, much less: under 10 seconds, while on Hadoop it was over 17 seconds. And we actually ran the SQL on, I think, a couple of processors, while Hadoop was on something like an 18-node cluster.

So the key message here is that relational databases are constantly moving the bar, getting faster and faster, and I think people who think, oh, we’re just going to put a little SQL engine on Hadoop and catch up in a couple of years to what relational databases have built over the last 20 years for doing high-performance, massively parallel SQL query, are being a little optimistic about how soon they are going to get to parity there.

Okay. So let’s go to the next slide. Here I just want to mention one thing that we’re doing in Database 12c around in-memory processing. For analytics, relational database engines have delivered very high-performance, massively parallel processing for many years, and they are very good at crunching through terabytes and terabytes of information.

But there’s been something of a breakthrough in the last few years in column-store technologies, and in particular in-memory column-store technologies, for making analytics even faster, and we just announced at OpenWorld a few weeks ago that we’re adding this in-memory column-store technology to Database 12c. This again is going to give us another big leap forward in analytic processing in the relational database, in this case the Oracle relational database. This, of course, is going to be very exciting for customers doing analytics against Big Data and data warehousing.
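The reason column stores speed up analytics can be shown with a toy sketch: a row store must walk whole records to total one field, while a column store keeps each column in its own contiguous array and scans only the bytes it needs. The sales records below are invented; real column stores add compression and vectorized execution on top of this layout idea.

```python
import array

# Row store: each sale is a full record; an aggregate touches every field.
rows = [
    {"id": 1, "region": "west", "amount": 10.0},
    {"id": 2, "region": "east", "amount": 5.0},
    {"id": 3, "region": "west", "amount": 7.5},
]

def total_row_store(rows):
    """Sum `amount` by walking whole records, as a row store scan would."""
    return sum(r["amount"] for r in rows)

# Column store: the `amount` column lives alone in a packed, contiguous
# array of doubles, so a scan reads only the column it actually needs.
amount_col = array.array("d", [10.0, 5.0, 7.5])

def total_column_store(col):
    """Sum a single packed column, as a column store scan would."""
    return sum(col)

print(total_row_store(rows), total_column_store(amount_col))  # 22.5 22.5
```

Both give the same answer; the difference is how much memory traffic each scan generates, which is why in-memory columnar formats make analytic queries so much faster.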

So again, we’re sort of raising the bar of what relational databases can do here. We’re not standing still. Relational databases are moving forward very aggressively into the analytic space further. And again, people who are building SQL engines from scratch again have to think about adding even more technologies than they are thinking of to sort of match the capabilities of these relational databases.

Okay. So let’s go to the next slide. What are some of the key differentiators for what Oracle is doing in the Big Data space versus other competitors? I think the big thing we are doing here is giving customers an integrated platform and an engineered platform. A customer can just come to Oracle and say, okay, I want a Big Data platform, and they would order Oracle’s Big Data Appliance and Oracle’s Exadata platform.

Those two platforms, or engineered systems, can be very easily integrated together. We have hardware integration, by using InfiniBand networking technology across both platforms, making it very efficient to move information back and forth. And then we have software integration, what we call our connectors, that ties together the Hadoop platform with the Oracle database platform.

For example, one of the connectors is a SQL connector that lets Oracle SQL reach out into Hadoop HDFS and run SQL queries against the HDFS data. Another example connector is a loader that very efficiently moves data from HDFS into the Oracle Exadata database. And then we have, of course, our whole array of technologies in Exadata that make it a great platform for doing Big Data analytics.

The next big differentiator, which I just mentioned, is that we are adding very high-performance in-memory columnar processing to our already very powerful relational database engine in Oracle, which is going to make us an even more outstanding interactive platform for doing analytics against Big Data.

Another big part of what we are doing here is that Oracle has a huge ecosystem around it of developers and ISVs and SIs who know and love the Oracle platform. There are huge sets of skilled consultants who know how to manage the platform and how to develop against it. We are building on top of that as we go into the Big Data space, and then finally, we’re giving a complete solution to our customers.

If you buy our Big Data Appliance, you buy Exadata and the connectors between those two. If you have any problem of course you just call up Oracle. We support you top to bottom, from hardware to software with any issues you have and of course we can provide you all the consulting help you need as well on top of that.

Okay. And then let’s just close. Of course, we have huge numbers of customers using our Big Data platform; I’ll just mention a couple here. UPMC is the University of Pittsburgh Medical Center, one of the leading medical research centers in the fields of genetics and other health sciences.

They are a huge Oracle customer for Big Data and a big user of Oracle Exadata. Another example here is SoftBank, a big telco in Japan and a huge data warehousing user; they have also recently been moving into the U.S. They actually moved all their warehousing off of Teradata onto Exadata several years ago, and they are a very successful customer.

Thomson Reuters is one of our big Big Data customers; they are using the Big Data Appliance and Exadata in their Big Data processing. StubHub is part of eBay; they do ticket reselling, and they are using Oracle’s R technology that I mentioned earlier for doing statistical analysis and predictive analytics.

And with that, I think we’ll move on to the Q&A session.
