
                              January 25, 2016

                              Kafka and more

                              In a companion introduction to Kafka post, I observed that Kafka at its core is remarkably simple. Confluent offers a marchitecture diagram that illustrates what else is on offer, about which I’ll note:

                              Kafka offers little in the way of analytic data transformation and the like. Hence, it’s commonly used with companion products. Read more

                              January 25, 2016

                              Kafka and Confluent

                              For starters:

                              At its core Kafka is very simple:

                              So it seems fair to say:

                              Read more

                              September 28, 2015

                              The potential significance of Cloudera Kudu

                              This is part of a three-post series on Kudu, a new data storage system from Cloudera.

                              Combined with Impala, Kudu is (among other things) an attempt to build a no-apologies analytic DBMS (DataBase Management System) into Hadoop. My reactions to that start:

                              I’ll expand on that last point. Analytics is no longer just about fast queries on raw or simply-aggregated data. Data transformation is getting ever more complex; that’s true in general, and it’s specifically true in the case of transformations that need to happen in human real time. Predictive models now often get rescored on every click. Sometimes, they even get retrained at short intervals. And while data reduction in the sense of “event extraction from high-volume streams” isn’t that big a deal yet in commercial apps featuring machine-generated data, if growth trends continue as many of us expect, it’s only a matter of time before that changes.

                              Of course, this is all a bullish argument for Spark (or Flink, if I’m wrong to dismiss its chances as a Spark competitor). But it also all requires strong low-latency analytic data underpinnings, and I suspect that several kinds of data subsystem will prosper. I expect Kudu-supported Hadoop/Spark to be a strong contender for that role, along with the best of the old-school analytic RDBMS, Tachyon-supported Spark, one or more contenders from the Hana/MemSQL crowd (i.e., memory-centric RDBMS that purport to be good at analytics and transactions alike), and of course also whatever Cloudera’s strongest competitor(s) choose to back.

                              September 17, 2015

                              Rocana’s world

                              For starters:

                              Rocana portrays itself as offering next-generation IT operations monitoring software. As you might expect, this has two main use cases:

                              Rocana’s differentiation claims boil down to fast and accurate anomaly detection on large amounts of log data, including but not limited to:

                              Read more

                              July 7, 2015

                              Zoomdata and the Vs

                              Let’s start with some terminology biases:

                              So when my clients at Zoomdata told me that they’re in the business of providing “the fastest visual analytics for big data”, I understood their choice, but rolled my eyes anyway. And then I immediately started to check how their strategy actually plays against the “big data” Vs.

                              It turns out that:

                              *The HDFS/S3 aspect seems to be a major part of Zoomdata’s current story.

                              Core aspects of Zoomdata’s technical strategy include: Read more

                              May 20, 2015

                              MemSQL 4.0

                              I talked with my clients at MemSQL about the release of MemSQL 4.0. Let’s start with the reminders:

                              The main new aspects of MemSQL 4.0 are:

                              There’s also a new free MemSQL “Community Edition”. MemSQL hopes you’ll experiment with this but not use it in production. And MemSQL pricing is now wholly based on RAM usage, so the column store is quasi-free from a licensing standpoint as well.

                              Read more

                              March 5, 2015

                              Cask and CDAP

                              For starters:

                              Also:

                              So far as I can tell:

                              Read more

                              December 31, 2014

                              Notes on machine-generated data, year-end 2014

                              Most IT innovation these days is focused on machine-generated data (sometimes just called “machine data”), rather than human-generated. So as I find myself in the mood for another survey post, I can’t think of any better idea for a unifying theme.

                              1. There are many kinds of machine-generated data. Important categories include:

                              That’s far from a complete list, but if you think about those categories you’ll probably capture most of the issues surrounding other kinds of machine-generated data as well.

                              2. Technology for better information and analysis is also technology for privacy intrusion. Public awareness of privacy issues is focused in a few areas, mainly: Read more

                              October 5, 2014

                              Streaming for Hadoop

                              The genesis of this post is that:

                              Of course, we should hardly assume that what the Hadoop distro vendors favor will be the be-all and end-all of streaming. But they are likely to at least be influential players in the area.

                              In the parts of the problem that Cloudera emphasizes, the main tasks that need to be addressed are: Read more

                              June 16, 2012

                              Introduction to Metamarkets and Druid

                              I previously dropped a few hints about my clients at Metamarkets, mentioning that they:

                              But while they’re a joy to talk with, writing about Metamarkets has been frustrating, with many hours and pages of wasted effort. Even so, I’m trying again, in a three-post series:

                              Much like Workday, Inc., Metamarkets is a SaaS (Software as a Service) company, with numerous tiers of servers and an affinity for doing things in RAM. That’s where most of the similarities end, however, as Metamarkets is a much smaller company than Workday, doing very different things.

                              Metamarkets’ business is SaaS (Software as a Service) business intelligence, on large data sets, with low latency in both senses (fresh data can be queried on, and the queries happen at RAM speed). As you might imagine, Metamarkets is used by digital marketers and other kinds of internet companies, whose data typically wants to be in the cloud anyway. Approximate metrics for Metamarkets (and it may well have exceeded these by now) include 10 customers, 100,000 queries/day, 80 billion 100-byte events/month (before summarization), 20 employees, 1 popular CEO, and a metric ton of venture capital.

                              To understand how Metamarkets’ technology works, it probably helps to start by realizing: Read more



