Have you delved into your data lately?

 


Instead of just reporting on public-sector data, shops should be exploring it. This is particularly true in government, where very large data sets can benefit from cutting-edge visual data exploration technologies.


For Very Big Data shops -- in either the public or the private sector -- data exploration can be a first step toward getting one's data management house in order. It can likewise yield high-value insights along the way.

At a basic level, says Andrew Cardno, chief technology officer (CTO) with data visualization specialist BIS2 Inc., information analysis is a pretty similar proposition in both private- and public-sector shops.

"They're very similar problems, actually. [In both cases] you have a massive data out-load, [while] the analytical techniques that are being applied mostly focus on small pieces of data or [on] known trends in the data and things like that, instead of high-dimensional exploration," he says.

"The comparison I'll often use is that when you have known data and known relationships, you can do reports. When you have known data and unknown relationships, you're doing data exploration."

Reporting, Cardno says, is where most shops are. Data exploration, on the other hand, is where most shops need to be. "When you're doing data exploration, you really need to be able to see more dimensions," he explains, arguing that the ability to intelligently display data in several dimensions should likewise distinguish today's cutting-edge visualization technologies.

Cardno cites a visual exemplar -- namely, Charles Minard's chart of the French Grande Armee's disastrous invasion of Russia -- that's a favorite among data visualization advocates. "In this graphic, there are something like six or seven dimensions of data. You get the whole picture. You see how it works. Any questions you ask after looking at that graphic are in the context of having understood the whole picture," he explains.

"It's a common problem for anyone who has a master set of data: the value is not normally in the things you know about it; the value is in the things you don't know about it, [in] the things that you haven't yet found."

Out of order

How does a shop -- public- or private-sector -- achieve Minard-like visual displays when its internal data integration plumbing is, in many cases, largely siloed, unprofiled, uncleansed, unstandardized, and often not entirely structured? If private-sector shops are behind the curve when it comes to practices such as enterprise information integration, enterprise-wide data quality, or master data management, how are public-sector organizations doing?

Consider the case of a consultant with a prominent government services firm. This person tells a story about one of his clients: a government agency that's preparing a large multi-year study involving tens of thousands of different "widgets." Right now, this consultant says, the agency is in the process of populating its database of prospective widgets. At the outset, it's collecting data -- via a multi-page information form -- about all of the widgets that have signed up to participate in its study. How can this agency determine that the information it is feeding its database is both consistent and accurate?

Its solution is at once simple and breathtakingly ham-fisted: the agency tasked two human beings with manually entering the information into its repository. It plans to compare the two sets of entries to ensure the data is "accurate."
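For readers who want to picture what that comparison involves, here is a minimal sketch in Python. It assumes each keyed-in form is held as a dictionary of field names and values; the function name `compare_entries` and the sample fields are hypothetical and are not drawn from the agency's actual system.

```python
def compare_entries(entry_a, entry_b):
    """Return the fields on which two keyed-in copies of one form disagree."""

    def normalize(value):
        # Collapse whitespace and ignore case so trivial differences don't count.
        return None if value is None else " ".join(str(value).split()).lower()

    mismatches = {}
    for field in sorted(set(entry_a) | set(entry_b)):
        a = entry_a.get(field)
        b = entry_b.get(field)
        if normalize(a) != normalize(b):
            mismatches[field] = (a, b)
    return mismatches


# Hypothetical example: the same form as typed by two different people.
first = {"widget_id": "W-001", "name": "Acme Widget", "weight_kg": "12.5"}
second = {"widget_id": "W-001", "name": "acme  widget", "weight_kg": "15.2"}

print(compare_entries(first, second))
```

A check like this flags keying mistakes where the two operators disagree. It cannot catch an error that both operators copy faithfully from the form, which is part of why agreement alone does not establish accuracy.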

This is a hyperbolic case, but it gets at a fundamental problem: if cutting-edge analytics or analytic technologies depend on reliable (and increasingly timely) feeds of clean, consistent, and accurate data, don't shops first have to get their data management houses in order before they step up to advanced analytic technologies, to say nothing of cutting-edge data visualization?

The answer, according to Cardno, is both yes and no.

If an organization wants to make effective use of most data visualization tools, he argues, it should expect to have a good understanding of its data.

This isn't necessarily a self-serving statement. According to business intelligence (BI) and data warehousing (DW) thought-leader Mark Madsen, a principal with consultancy Third Nature, Inc., data visualization tools tend to be as sensitive to the timeliness, accuracy, or consistency of data as any other analytic technology. "If you want to work on really large data sets [with most of these tools], you have to summarize them first. They use really smart techniques on the front end married to archaic plumbing from the 1980s and 90s on the back end," he says.
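As a rough illustration of the kind of summarizing Madsen describes, the sketch below rolls individual records up into counts per time bucket and location before anything is charted. The record layout, the `summarize` function, and the 60-unit bucket size are assumptions made for this example, not a description of any particular tool.

```python
from collections import Counter


def summarize(records, bucket_size):
    """Count records per (time bucket, location) pair."""
    counts = Counter()
    for timestamp, location in records:
        # Round each timestamp down to the start of its bucket.
        bucket = (timestamp // bucket_size) * bucket_size
        counts[(bucket, location)] += 1
    return counts


# Hypothetical records: (minutes since start, location).
records = [(5, "north"), (42, "north"), (61, "south"), (75, "south"), (119, "south")]
summary = summarize(records, 60)

for key in sorted(summary):
    print(key, summary[key])
```

The trade-off is the one the article goes on to discuss: a summary is small enough for a conventional charting tool, but the individual records, and any surprises hiding in them, are no longer visible.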

On the other hand, Cardno argues, some kinds of visual analytic technologies aren't as sensitive to the consistency or the quality of the data they're consuming. This is often the case with visualizations that involve huge data sets, he says.

Cardno's company, BIS2, specializes in visualization problems involving huge data sets. It offers Cardno's Super Graphics as a remedy for the limitations of so-called "traditional" data visualization technologies. "The nature of those [traditional] graphics -- with some exceptions like maybe scatterplots -- is that they depend on knowledge of the dimensionality of the data. They have a certain kind of expectation or understanding of information that normally comes with them. It's a very broad generalization," he maintains. "The nature of the Super Graphic is that you can comprehend vast amounts of data in different dimensions at a glance."

Cardno and BIS2 aren't without their detractors. Data visualization thought-leader Stephen Few, for example, has described Super Graphics as "dysfunctional visualizations." Few has argued that Cardno's Super Graphics distort or produce "inaccurate representation[s]" of source data.

Cardno doesn't dispute this. He cites one of Few's criticisms -- the claim that Super Graphics distort the time dimension by depicting (in one example) the most recent calendar year as of a longer duration than others -- as a case in point. On anything but a quantum scale, 2011 isn't longer or shorter than 2010; but the distortions of the Super Graphic -- which are a product of the circular visual metaphor that Cardno employs -- help inform in part because they distort.

Cardno has his detractors, but he also has plenty of defenders. At last month's TDWI Winter World Conference in Las Vegas, for example, Madsen described BIS2's Super Graphics as "hyper-advanced." He and co-presenter Jos van Dongen (a principal with Dutch BI and DW consultancy Tholis) invited Cardno to demonstrate Super Graphics to attendees of their day-long BI and DW technology course.

Madsen endorses at least one of Few's criticisms -- chiefly, that Super Graphics aren't immediately intuitive -- even as he suggests that Cardno's critics are missing the point. "I've had my criticisms with his interface in that [Super Graphics are] not necessarily immediately intuitive; you first kind of have to learn what technique is being used before you can apply it," Madsen comments. "[Super Graphics] are geared toward a very specific kind of problem which is more than uni-dimensional or two-dimensional data, and very large sets of data being viewed in their entirety at once. It's really a question of data exploration."

Unlike most BI technologies (and even many data visualization tools), BIS2's Super Graphics are designed to be highly interactive. "It's fully interactive over the data set, so you're showing on some of those interfaces several hundred million data points and, say, four or five dimensions of data," Madsen notes. "[Proponents of] conventional data visualization tend to think of reduced data sets, non-interactive viewing, and fairly simplistic techniques. That's great for the basics, but there are problem sets for which that doesn't work."

Tailor-made for the public sector?

The most obvious application is data exploration involving very large data sets. At present, BIS2's biggest reference customers are concentrated in two Very Big Data market segments: the gaming industry -- in which casinos are sifting through hundreds of terabytes of data about individual slot machines (involving multiple dimensions such as time, location, volume, and profit/loss), dealers, individual gamblers, and so on -- and the airline industry.

Cardno says his Super Graphics techniques are applicable to other markets, too -- including retail, financial services, and, of course, government.

Cardno doubts that government shops are any more behind the curve data management-wise than are private-sector firms. At the same time, he argues, there are data visualization techniques appropriate to different kinds of problems.

"No matter how hard you try, when it comes to massive amounts of data entry, the data is always going to be error-prone. It may have large systematic errors, it may have errors in unexpected places. It's actually hard to anticipate just how or where [errors] will occur," he says. "You need the data to have a feedback loop. The day that people can see the data and understand the data and interact with it, is the day that data can start to have quality. Before you've done that, you don't know anything about them."
