Financial services companies have always used technology for competitive advantage, and Enterprise Data Mashups are no different. Enterprise Data Mashups access, transform, integrate and publish data from heterogeneous sources (structured, unstructured, and Web) to create real-time composite data services that can be used by both applications and people. In short, they combine the power of enterprise integration with the ease of use of Web mashups, supporting applications such as Single View of Customer, Competitive and Market Intelligence, and Financial Data Aggregation.
The arrival of Enterprise Data Mashups
Over the past 25 years, the revolution in business information systems and their corresponding data stores has transformed how global organizations produce and use information. This has happened in three parallel realms over overlapping time frames: enterprise business applications, personal productivity and content applications, and the Web. Enterprise applications are heavily reliant on structured data stored in relational databases. Personal productivity and content applications contain mostly unstructured information stored in file systems. The Web, which has become the definitive information processing platform, B2B/B2C interface, and collaboration and publishing platform for individuals and groups in the form of blogs, reviews, wikis, etc., encompasses a wide range of data formats from fully structured to fully unstructured and every shade in between.
All three types of information systems exist to increase organizational and personal efficiency and effectiveness through information flows. Ironically, the flow of information between these system types has been limited by differences in format and data structure, so integration efforts have focused largely on structured data sources. With the emergence of holistic and semantic data integration technologies, innovative organizations have started to leverage the immediacy and relevance of unstructured and Web data to enrich enterprise applications and processes and, in the reverse scenario, to bring real-time enterprise data to collaborative and edge-of-enterprise Web applications.
Enterprise Data Mashups refer to the discipline and technologies that enable the integration of structured, semi-structured (Web) and unstructured data sources. They are typically characterized by lightweight and flexible architectures to extract data from heterogeneous sources, build semantic relationships across them, and provide real-time access to the composite virtual data services. They can also be used to read or write data and orchestrate data integration workflows.
Enterprise Data Mashups are a technology innovation driven by two forces: the business imperative to increase information agility and transparency while reducing costs, and a shift in the behavior of Web 2.0 generation users, which forces companies to interact via email, web forums, blogs, review sites, etc. rather than through structured mechanisms.
Leaders and followers
Business leaders, CIOs and information architects have a key objective on their agenda: to make information a strategic asset while reducing costs. But what makes information truly strategic? It is the ability to gain unique insights or provide relevant and comprehensive information in a timely manner to promote innovation, improve customer service, streamline operations, reduce operating costs, enable better decisions, and expose opportunities and threats before it is too late. However, this is not possible if the vast majority of information available to decision makers is ignored and information management initiatives focus only on a limited set of enterprise applications and databases.
Take the example of customer information. Many companies still have a disconnected view of their customers across products, divisions, applications, and time, and are struggling to unify these many fragments into a complete picture. Some have succeeded in integrating internal structured data sources to provide a “Single View” of customers. This is a step forward, but still falls short of what is needed. The leaders and innovators in an industry have managed to assemble a truly “Holistic View” of the customer, including the competitive choices available to each specific customer, customer feedback, preferences, and lifestyle information that might indicate future sales opportunities or provide ideas for product improvement. They have done so by merging and building relevance across structured customer application data, unstructured call notes and emails, competitor and public websites, and user-generated data in blogs, reviews, etc.
Another example is Competitive and Market Intelligence. So much information is now available on the Web and in published documents. On the one hand, researching companies, markets, regulations, and products is much easier. On the other, there is potential for much noise rather than true intelligence. But ignoring this wealth of information is highly risky, since other leading financial services companies have found a way to automatically harness Web information and match it with internal data and taxonomy filters to make it relevant. Enterprise Data Mashup tools relieve the intelligence analyst of repetitious data gathering tasks and allow him or her to focus on value-added analysis to spot trends, opportunities and risks.
It is also important to take into account the cost-benefit of information integration. IT departments have become wary of multi-year integration projects with dubious ROI. It is therefore critical to ensure “pay-as-you-go” value in implementing data integration. It is here that Enterprise Data Mashups really shine because they provide immediate access to integrated data without incurring the cost of moving it.
In short, organizations that have recognized the need to “mash up” structured, unstructured and Web data from internal and external, public and private sources, and to use this information to create competitive advantage, will be the future leaders.
Thus, Enterprise Data Mashups serve the business and IT agenda by making information a strategic asset while reducing costs. This is done by:
• Providing access to relevant business information in real-time from any source, anywhere
• Building relationships across heterogeneous data structures to provide new insights not possible before
• Creating reusable data services composed from data mashups that can feed enterprise and web applications, portals and dashboards
• Leveraging existing information assets and technologies with a flexible data federation model
The Enterprise Data Mashups platform
The Enterprise Data Mashups platform has emerged to bridge the gap between Web 2.0 and Enterprise 2.0 and provide data integration across any source. It combines in a single platform three technologies: Enterprise Information Integration (EII) or data federation (for structured data), Web extraction and automation (for semi-structured data), and Search/Indexing (for unstructured data). On top of these it adds functionality to build relationships across the sources and create composite real-time data services.
Four fundamental pillars have materialized to define the capabilities of a true Enterprise Data Mashups platform:
• Any Data Source: Data engines that automatically navigate and extract data from all three data types – structured, unstructured, and Web
• Build Relationships: Unified modeling and execution environment to normalize, transform, semantically match and relate data across heterogeneous source types using common metadata and semantics
• Multi-mode Access and Service-Oriented: Access to the combined data via web service, message bus, API, query, or search. Create reusable SOA data services
• Real-time & Interactive: Real-time read/write access with enterprise-class reliability, performance and scalability – even for web and unstructured sources
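As a rough illustration of how these pillars fit together, the following minimal Python sketch assembles a composite view of one customer from three stand-in sources. Everything in it is an assumption made for illustration and does not come from any particular mashup product: the in-memory SQLite table plays the structured source, the `COMPETITOR_PAGE` HTML snippet plays the Web source, the `CALL_NOTES` list plays the unstructured source, and the names `RateParser` and `holistic_view` are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser

# Structured source (assumed): an in-memory table standing in for a CRM database.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, product TEXT)")
crm.execute("INSERT INTO customers VALUES (1, 'Acme Corp', 'Business Loan')")

# Web source (assumed): a static HTML snippet standing in for a competitor's rate page.
COMPETITOR_PAGE = """
<table>
  <tr><td class="product">Business Loan</td><td class="rate">6.1%</td></tr>
  <tr><td class="product">Mortgage</td><td class="rate">5.4%</td></tr>
</table>
"""

# Unstructured source (assumed): free-text call notes.
CALL_NOTES = [
    "Acme Corp called to ask about refinancing options.",
    "Globex asked for a new card.",
    "Follow-up: Acme Corp is comparing our loan rate with other banks.",
]


class RateParser(HTMLParser):
    """Collects {product: rate} pairs from the product and rate table cells."""

    def __init__(self):
        super().__init__()
        self.rates = {}
        self._field = None    # class of the <td> currently being read
        self._product = None  # last product name seen

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "product":
            self._product = text
        elif self._field == "rate" and self._product is not None:
            self.rates[self._product] = text

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None


def holistic_view(customer_id):
    """Relate the three sources through the customer's name and product."""
    row = crm.execute(
        "SELECT name, product FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    name, product = row
    parser = RateParser()
    parser.feed(COMPETITOR_PAGE)
    notes = [note for note in CALL_NOTES if name.lower() in note.lower()]
    return {
        "name": name,
        "product": product,
        "competitor_rate": parser.rates.get(product),
        "call_notes": notes,
    }
```

Calling `holistic_view(1)` returns a single record that carries the customer's product, the competitor rate for that product, and the call notes that mention the customer. The data is read from each source at request time rather than copied into a new store, which is the federation idea in miniature; a real platform would add semantic matching, caching and write-back that this sketch leaves out.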
More than the sum of its parts
To accomplish the goals of holistic data integration, any Enterprise Data Mashups platform must be more than the sum of the key capabilities found in separate technology stacks for EII, Web extraction and Search/Indexing. These three technologies have evolved with very different paradigms, data models, and methodologies. Enterprise Data Mashups must improve on each area in order to combine these sources in a unified data model, while preserving the strengths and accommodating the limitations of each data type. The following are some of the areas where Enterprise Data Mashups have improved on the component technologies to deliver on their promise:
• Extended Data Model: Support relational and hierarchical data and index files
• Advanced Web Automation: Support crawling as well as surgical navigation using authentication, forms, pop-ups etc. to access only the required data.
• Hidden Web Data: Iterate through navigation and extraction sequences to build a database out of the data hidden behind Web forms.
• Structuring Unstructured Data: Convert an index of unstructured content into summary data that can be queried like a database using keyword and taxonomy filters
• Data Consistency: Normalize, cleanse, transform and rewrite data to be merged
• Optimized Data Federation and Execution: Optimize query execution across heterogeneous sources by remaining aware of the idiosyncrasies of each source and using asynchronous and parallel execution for performance.
• Multi-mode Access: Support query for databases, service request/reply for a web service, search for unstructured content etc.
• Automatic Web Integration Maintenance: Use example-based inference techniques to automatically adapt to navigational and format changes on Web sites.
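The "Structuring Unstructured Data" item above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TAXONOMY` categories, the sample `DOCUMENTS`, and the `classify` and `build_summary` helpers are all hypothetical, and plain substring counting stands in for a real search index.

```python
import sqlite3

# Hypothetical taxonomy: category -> keywords that signal it.
TAXONOMY = {
    "regulation": ["compliance", "regulator", "basel"],
    "competitor": ["rival", "competitor", "market share"],
}

# Hypothetical unstructured content, keyed by document id.
DOCUMENTS = [
    ("doc1", "The regulator published new compliance guidance for lenders."),
    ("doc2", "A rival bank gained market share in retail deposits."),
    ("doc3", "Quarterly staff picnic scheduled for June."),
]


def classify(text):
    """Return (category, hit_count) pairs for each taxonomy category found in text."""
    lowered = text.lower()
    matches = []
    for category, keywords in TAXONOMY.items():
        hits = sum(lowered.count(keyword) for keyword in keywords)
        if hits:
            matches.append((category, hits))
    return matches


def build_summary(documents):
    """Load the taxonomy matches into a table that can be queried with SQL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE doc_topics (doc_id TEXT, category TEXT, hits INTEGER)")
    for doc_id, text in documents:
        db.executemany(
            "INSERT INTO doc_topics VALUES (?, ?, ?)",
            [(doc_id, category, hits) for category, hits in classify(text)],
        )
    return db


summary = build_summary(DOCUMENTS)
rows = summary.execute(
    "SELECT doc_id, hits FROM doc_topics WHERE category = ? ORDER BY hits DESC",
    ("competitor",),
).fetchall()
print(rows)
```

Once the free text has been reduced to summary rows, the same query mechanism used for relational sources can filter and rank it, and the resulting table can be joined with structured data.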
Applications of Enterprise Data Mashups
Due to their horizontal nature, Enterprise Data Mashups can add value to many different business uses. As organizations become aware of the unique capabilities of Enterprise Data Mashups to merge structured and unstructured information, they are discovering new uses and even new business models to enhance their competitive position. The following are some common scenarios where data mashups have been used successfully to enhance business value:
Holistic view of single entity
Currently, information about a customer, supplier, product or project is dispersed among various systems and sources. These fragments must be brought together into a “Holistic View” to conduct business more efficiently and trigger innovation.
Competitive and market intelligence
Competitive and market intelligence is often gathered manually and sporadically, making it less useful in daily business decisions. Automating the extraction of context-sensitive competitive information to integrate with internal applications can create actionable insights.
Operational business intelligence
Timely aggregation, analysis and presentation of the most relevant data on a real-time dashboard helps both individuals and managers monitor performance and make daily operational decisions in key business processes such as sales, trading, product development, risk management and compliance.
External watch for opportunities and threats
Web 2.0 has created an explosion of information that grows every second – on companies, markets, patents, regulations, competitors, and customer reviews/blogs. Enterprise Data Mashups can automate navigation, extraction and maintenance of relevant snippets of information using business rules unique to each constituency. This allows business users to focus on analysis, exceptions and alerts instead of tedious data collection.
B2B integration and web automation
Often, the only way to access or share information between departments or business partners is by periodically logging in through a web interface or by emailing back and forth content such as price lists, promotions, reference data, order status, etc. Enterprise Data Mashups provide a flexible solution to reliably extract or post data via the Web and integrate with internal enterprise systems.
Information extraction, consolidation and organization
Knowledge management, research, account aggregation, price comparisons – all have the common problem of constantly collecting, organizing and inter-relating information for general use. Search tools can find the information, but Enterprise Data Mashups are uniquely capable of building relationships between multiple source/data types and supporting structured queries against the composite information.
The goal of making information a strategic asset can no longer be limited to the subset of structured data contained in enterprise databases and applications. Financial institutions are forced to deal with the reality of the information explosion driven by Web 2.0 and user-generated content. More useful and relevant data now resides outside enterprise applications than within them. But currently enterprise applications are unable to access, let alone integrate, this information.
CIOs and executives who recognize the unique business value and competitive advantage that can be derived from merging Enterprise with Web 2.0 information will be the next business leaders. Companies that do not learn to leverage this new generation of information will be left behind. Denodo, named a “Cool Vendor” by Gartner in 2007 and the recognized innovator of Enterprise Data Mashup technology and applications, offers robust technology and experience working with leading financial services companies to address a specific business pain point while building towards a broader solution.
Author: Suresh Chandrasekaran, Denodo Technologies