This document presents our general views on the subject, drawing on industry data found on the web and in Wikipedia.


An ontology formally represents knowledge as a hierarchy of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts.

Ontologies are the structural frameworks for organizing information, and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework.

As it Relates to the Big Data Trend

Ontology claims to be to applications what Google was to the web. Instead of integrating the many different enterprise applications within an organization to obtain, for example, a 360-degree view of customers, Ontology enables users to search a schematic model of all data within the applications. Users extract relevant data from a source application, such as a CRM system, Big Data applications, files or warranty documents. The extracted semantics are linked into a search graph rather than a schema, giving users the results they need.

Ontology uses a unique combination of an inherently agile, graph-based semantic model and semantic search to reduce the timescale and cost of complex data integration challenges.

Ontology gives users a different approach in using enterprise applications, removing the need to integrate the different applications. It allows users to search and link applications, databases, files, spreadsheets, etc., anywhere. The premise of Ontology is very interesting, because in the past years a vast amount of enterprise applications for various needs and with various requirements have been developed and used by organizations. Integrating these applications to obtain a company-wide integrated view is difficult, expensive and often not without risks.

Why is it important? It eliminates the need to integrate systems and applications when looking for critical data or trends.

How is it applied, and what are the important elements that make it all work?

Ontology is rethinking data acquisition, data correlation and data migration projects in a post-Google world.

Enables the Semantic Web

The Semantic Web

The semantic web provides a common framework that allows data to be shared and reused across application, enterprise and community boundaries.

While its critics have questioned its feasibility, many others argue that applications in industry, biology and human sciences research have already proven the validity of the original concept.

The main purpose of the semantic web is driving the evolution of the current web by enabling users to find, share and combine information more easily. Humans are capable of using the web to carry out tasks such as finding the Estonian translation for “twelve months,” reserving a library book, and searching for the lowest price for a DVD. However, machines cannot accomplish all of these tasks without human direction, because web pages are designed to be read by people, not machines. The semantic web is a vision of information that can be readily interpreted by machines, so machines can perform more of the tedious work involved in finding, combining and acting upon information on the web.

The Semantic Web, as originally envisioned, is a system that enables machines to “understand” and respond to complex human requests based on their meaning. Such an “understanding” requires that the relevant information sources be semantically structured.

Often the terms “semantics,” “metadata,” “ontologies” and “semantic web” are used inconsistently. In particular, these terms are used as everyday terminology by researchers and practitioners spanning a vast landscape of different fields, technologies, concepts and application areas. Furthermore, there is confusion with regard to the current status of the enabling technologies envisioned to realize the semantic web.

Semantic Web Solutions

The semantic web takes the solution further. It involves publishing in languages specifically designed for data: Resource Description Framework (RDF), Web Ontology Language (OWL) and Extensible Markup Language (XML). HTML describes documents and the links between them. RDF, OWL and XML, by contrast, can describe arbitrary things such as people, meetings or airplane parts.

These technologies are combined in order to provide descriptions that supplement or replace the content of Web documents. Thus, content may manifest itself as descriptive data stored in Web-accessible databases, or as markup within documents (particularly in Extensible HTML [XHTML] interspersed with XML or, more often, purely in XML, with layout or rendering cues stored separately). The machine-readable descriptions enable content managers to add meaning to the content, i.e., to describe the structure of the knowledge we have about that content. In this way, a machine can process knowledge itself, instead of text, using processes similar to human deductive reasoning and inference, thereby obtaining more meaningful results and helping computers perform automated information gathering and research.


The term “semantic web” is often used more specifically to refer to the formats and technologies that enable it. The collection, structuring and recovery of linked data are enabled by technologies that provide a formal description of concepts, terms and relationships within a given knowledge domain:

  • Resource Description Framework (RDF), a general method for describing information
  • RDF Schema (RDFS)
  • Simple Knowledge Organization System (SKOS)
  • SPARQL, an RDF query language
  • Notation3 (N3), designed with human-readability in mind
  • N-Triples, a format for storing and transmitting data
  • Turtle (Terse RDF Triple Language)
  • Web Ontology Language (OWL), a family of knowledge representation languages
  • Rule Interchange Format (RIF), a framework of web rule language dialects supporting rule interchange on the Web
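The triple model underlying these formats can be sketched in a few lines of Python. This is only an illustration of the idea, not a real RDF library; the prefixed names (`ex:`, `rdf:`) are hypothetical stand-ins for full URIs:

```python
# Minimal sketch of the RDF data model: every statement is a
# (subject, predicate, object) triple. All names are hypothetical,
# using compact "prefix:name" notation in place of full URIs.
triples = [
    ("ex:AcmeBank", "rdf:type", "ex:Bank"),
    ("ex:AcmeBank", "ex:holds", "ex:Bond123"),
    ("ex:Bond123", "rdf:type", "ex:CorporateBond"),
    ("ex:Bond123", "ex:issuedBy", "ex:WidgetCorp"),
]

# Querying is pattern matching: find everything AcmeBank holds.
held = [o for (s, p, o) in triples if s == "ex:AcmeBank" and p == "ex:holds"]
print(held)  # ['ex:Bond123']
```

Real deployments would use an RDF library and full URIs, but the shape of the data is exactly this: a graph of subject-predicate-object statements.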

Descriptions of Key Components

  • XML provides an elemental syntax for content structure within documents, yet associates no semantics with the meaning of the content contained within. XML is not at present a necessary component of semantic web technologies in most cases, as alternative syntaxes exist, such as Turtle. Turtle is a de facto standard, but has not been through a formal standardization process.
  • XML Schema is a language for providing and restricting the structure and content of elements contained within XML documents.
  • RDF is a simple language for expressing data models, which refer to objects (“web resources”) and their relationships. An RDF-based model can be represented in a variety of syntaxes, e.g., RDF/XML, N3, Turtle and RDFa. RDF is a fundamental standard of the Semantic Web.
  • RDF Schema extends RDF and is a vocabulary for describing properties and classes of RDF-based resources, with semantics for generalized hierarchies of such properties and classes.
  • OWL adds more vocabulary for describing properties and classes; among others, relations between classes (e.g., disjointness), cardinality (e.g., “exactly one”), equality, richer typing of properties, characteristics of properties (e.g., symmetry), and enumerated classes.
  • SPARQL is a protocol and query language for semantic web data sources.
  • RIF is the W3C Rule Interchange Format, an XML language for expressing Web rules that computers can execute. RIF provides multiple versions, called dialects, including the RIF Basic Logic Dialect (RIF-BLD) and the RIF Production Rule Dialect (RIF-PRD).
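To make the SPARQL idea concrete, here is a toy basic-graph-pattern matcher in Python. Real SPARQL engines do far more (joins, filters, property paths); this sketch, with invented foaf-style names, only shows the core idea of matching triple patterns that contain variables:

```python
# A toy matcher in the spirit of SPARQL basic graph patterns.
# All names are hypothetical; real engines (e.g. in RDF stores)
# are far more capable.
triples = {
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Bob", "foaf:knows", "ex:Carol"),
    ("ex:Alice", "foaf:name", '"Alice"'),
}

def match(pattern, data):
    """Yield variable bindings for one triple pattern.
    Variables are strings starting with '?'."""
    for t in data:
        binding = {}
        for p, v in zip(pattern, t):
            if p.startswith("?"):
                binding[p] = v
            elif p != v:
                break
        else:
            yield binding

# Analogue of: SELECT ?who WHERE { ex:Alice foaf:knows ?who }
results = sorted(b["?who"] for b in match(("ex:Alice", "foaf:knows", "?who"), triples))
print(results)  # ['ex:Bob']
```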

Current State of Standardization

Well-established standards:

  • Unicode
  • Uniform Resource Identifier
  • XML
  • RDF
  • RDFS
  • Web Ontology Language (OWL)
  • Rule Interchange Format (RIF)

Banks led the way in standardizing data collection, retrieval and compliance.

Too much of a good thing can be bad, and this is especially true when it comes to data. Companies, particularly banks, have a wealth of information at their fingertips, but until recently they lacked a basic method to harness this data and put it to better use.

The Securities and Exchange Commission recognized this untapped potential and, in 2009, mandated that all publicly held companies disclose their financial information using eXtensible Business Reporting Language (XBRL) — a standardized method of data collection and reporting. XBRL could facilitate more accurate comparisons across companies to improve business performance, investment analysis and decision-making. More specifically, by 2010 the Securities and Exchange Commission (SEC) in the U.S. and the United Kingdom’s HM Revenue & Customs (HMRC) and Companies House had begun to require companies to use XBRL, as had regulators in Singapore, and others were following suit. The SEC’s deployment was launched in 2010 in phases, with the largest filers going first: by 2013, large foreign companies using International Financial Reporting Standards (IFRS) were expected to submit their financial returns to the SEC using XBRL. In the UK in 2013, both HMRC and Companies House accepted XBRL in the iXBRL format.
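To see why tagged filings are easier to compare than PDFs, consider a heavily simplified, hypothetical XBRL-style fragment. Real filings use the official taxonomies, with contexts and units defined elsewhere in the document; this sketch only shows that each fact is a machine-readable XML element:

```python
# Hypothetical, heavily simplified XBRL-style instance fragment.
# Real filings use the official us-gaap taxonomy with full context
# and unit definitions; this only illustrates machine-readable facts.
import xml.etree.ElementTree as ET

instance = """<xbrl xmlns:us-gaap="http://fasb.org/us-gaap/2023">
  <us-gaap:Revenues contextRef="FY2023" unitRef="USD">1000000</us-gaap:Revenues>
  <us-gaap:Assets contextRef="FY2023" unitRef="USD">5000000</us-gaap:Assets>
</xbrl>"""

root = ET.fromstring(instance)
ns = {"us-gaap": "http://fasb.org/us-gaap/2023"}

# A program (or regulator) can pull a specific fact directly,
# with no scraping or manual re-keying.
revenues = int(root.find("us-gaap:Revenues", ns).text)
print(revenues)  # 1000000
```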

One approach towards making data integration easier is Model Driven Reporting, or ModelDR.

XBRL was adopted by the Ministry of Corporate Affairs (MCA) of India for filing financial and costing information with the Central Government.

However, at this point, most companies see the XBRL mandate as a compliance headache rather than a value-added tool for analysis. An exception is the banking industry, which presents an excellent example of how XBRL can ease the Big Data headache through the creation of high-quality, consistent data.

Model Driven Reporting

Business reporting in large enterprises has issues and challenges:

  1. Order-of-magnitude increases in volume and complexity, especially in the finance sector
  2. Inconsistent and incompatible definitions, making reports incomparable and misleading
  3. The most critical information is hidden by too much low-value information
  4. The report generation process is manual and error-prone

What is ModelDR?

ModelDR is a model-driven tool for business report design and management. It separates the logic of what is required from the physical implementation, making both more efficient.

ModelDR specifically leverages the power of the business modeling tool MagicDraw UML, from NoMagic. MagicDraw is used by the world’s largest banks, governments, military establishments and many other enterprises. It enables the business domain to be modeled and artifacts such as specifications, code and tests to be autogenerated.

ModelDR extends this capability with:

  • A business report design framework, built to the XBRL standard, from which reports can be designed and specified
  • A rules engine to enforce good report design
  • Design templates for common business reports, such as EU and SEC regulatory reports
  • Automatic imports of legacy reports, for reengineering
  • Automatic generation of business report specifications and code
  • Cross-linking of reports and business domain data for guaranteed correctness

By supporting XBRL, ModelDR supports the global standard for business reporting, enabling integration with the great majority of the world’s regulators.

What does ModelDR give me?

ModelDR empowers data architects, business analysts, testers, auditors, finance managers and regulatory reporters with:

1. Guaranteed data and report integrity

The hundreds or thousands of reports within an enterprise are complex, and the supporting databases and systems are equally complex. Filling the reports with data requires mapping them to the business systems. ModelDR:

  • Puts the reports and database designs in one place
  • Maps between the reports and the databases
  • Guarantees the quality of your reporting through integrated validation rules

2. Agile development and modification of business reports

  • Being a graphical design tool means rapid, low-cost design
  • With its visualization and design power, you can design your own taxonomies for your enterprise

3. Low cost, correct systems

The structured nature of model-driven design makes it possible to automate many functions downstream of design. Combining it with the MagicDraw Model Driven Testing plugin, you can automate the generation of report specifications, reporting code, data loading, test cases, test data and test running.

4. User friendly understanding and design of complex business reports

ModelDR visualizes report taxonomies in a way other tools cannot. The structure is exposed and can be easily understood by non-experts. You can see the entities, dimensions, rules and mappings, for example in IFRS, SEC and US GAAP reports, as well as in in-house reports. The tool helps your enterprise visualize new report designs, and the learning curve is reduced because the design is expressed visually.

ModelDR input from Greg Soulsby.

A simple analogy.

Let’s first imagine a real orchestra. Without a conductor, it would sound like a lot of annoying noise; add a conductor, and what was annoying can quickly become very enjoyable. Now we will get a bit more technical, but remember the role the conductor plays as we go along.

At a very high level, a crucial aspect of service oriented architecture (SOA) is service orchestration. Enterprise systems and integration projects designed according to SOA principles depend on successful service orchestration. Finding a platform with enhanced service orchestration capabilities, then, is a high priority for enterprises looking to build their systems according to SOA. Before going on, let’s make sure we are on the same page when it comes to SOA.

SOA is an approach to developing enterprise systems by loosely coupling interoperable services — small units of software that perform discrete tasks when called upon — from separate systems across different business domains. SOA emerged in the early 2000s, offering IT departments a way to develop new business services by reusing components from existing programs within the enterprise, rather than writing functionally redundant code from scratch and developing new infrastructures to support them. With SOA, functionalities are expressed as a collection of services rather than a single application, marking a fundamental shift in how developers approach enterprise architecture design.

Example: A Bank Loan

To get a better understanding of service orchestration, let’s look at a bank loan example. A loan broker wants to make a loan request on behalf of a customer and uses an automated Loan Request Service. The broker accesses the Loan Request Service in the enterprise system to make the initial loan request, which is sent to an orchestrator (the conductor) that then invokes other services in the enterprise, partner systems and/or the cloud to process the request.

The individual sub-services involved in the loan request include

  • a service to obtain credit scores from a credit agency,
  • a service to retrieve a list of lenders,
  • a service to request quotes from a bank service, and
  • a service to process quotes with the data from the other services.

Together, the orchestrated services comprise the Loan Request Service, which then returns a list of quotes from potential lenders to the broker who made the original request.
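The orchestration described above can be sketched in Python, with each sub-service reduced to a plain function. In a real SOA deployment these would be remote service calls coordinated by an ESB; all names, rates and data here are invented:

```python
# Sketch of the loan-request orchestration. Each sub-service is a plain
# function here; in a real SOA deployment these would be remote calls.
# All names, scores and rates are hypothetical.

def get_credit_score(customer):
    return {"alice": 720}.get(customer, 600)

def get_lenders(score):
    return ["BankA", "BankB"] if score >= 700 else ["BankB"]

def get_quote(lender, amount):
    rates = {"BankA": 0.049, "BankB": 0.056}
    return {"lender": lender, "rate": rates[lender], "amount": amount}

def loan_request_service(customer, amount):
    """The orchestrator: calls each sub-service in turn and assembles
    the list of quotes returned to the broker."""
    score = get_credit_score(customer)
    lenders = get_lenders(score)
    return [get_quote(lender, amount) for lender in lenders]

quotes = loan_request_service("alice", 250_000)
print([q["lender"] for q in quotes])  # ['BankA', 'BankB']
```

The point of the pattern is that `loan_request_service` contains no business logic of its own; it only sequences existing services, which is exactly what an orchestrator does.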

As the above example illustrates, service orchestration is a fundamental aspect of successfully implementing SOA. In a truly service-oriented architecture, new applications are created by new orchestrations of existing services — not by writing new code.

If it were only that simple

On the surface, service orchestration and SOA are relatively simple concepts. For enterprises faced with integration challenges, skyrocketing IT budgets and increasingly complex infrastructures, building new applications with granular and reusable software components is an understandably attractive approach to creating more agile and competitive systems and reducing time to market.

Service orchestration and SOA, however, can be difficult to achieve without the right set of tools. In SOA’s early days, CTOs of large companies eagerly adopted it and went about implementing it with a rip-and-replace model. Such an approach resulted in high financial costs as well as major time investments, since it often required developers to orchestrate services programmatically (i.e., write new code), defeating the ultimate purpose of adopting SOA.

What was needed was a simpler and more flexible way to perform service orchestration and implement SOA, and the enterprise service bus (ESB) emerged as the go-to mechanism for both. There are a number of ESB platforms on the market today, but if you buy into the idea that embracing SOA eliminates writing code and allows extensive reuse of components in existing programs, why stop there? We would suggest an ESB platform that can do all that has been mentioned so far, plus let your business stakeholders, IT stakeholders and IT operations stakeholders all be on the same page. This would eliminate false starts and finger-pointing.

The Dodd–Frank Wall Street Reform and Consumer Protection Act was signed into federal law by President Barack Obama on July 21, 2010, bringing the most significant changes to financial regulation in the United States since the regulatory reform that followed the Great Depression. It made changes in the American financial regulatory environment that affect all federal financial regulatory agencies and almost every part of the nation’s financial services industry.

The overall goal of the legislation is to promote the financial stability of the United States by improving accountability and transparency in the financial system, to end “too big to fail,” to protect the American taxpayer by ending bailouts, to protect consumers from abusive financial services practices, and for other purposes. There is much debate on whether Dodd-Frank accomplishes the stated goal. One thing we can be sure of: it is very complex and costly to implement. Each page of Dodd-Frank law translates to 10 pages of rules in practice. It has been almost four years, and we have not begun to get a handle on this problem.

Some Ways to Deal with Compliance

We have discussed FIBO (Financial Industry Business Ontology) and XBRL (eXtensible Business Reporting Language) — both of which help enterprises deal with financial reporting and financial compliance regulations imposed on them since 2009.

What is the relationship between these two standards? FIBO is a direct model of all the financial instruments that a bank or trading company might have. The financial reports will be partially dependent upon these, but they are also dependent upon non-financial aspects of the company: sales, costs, inventory, etc. While FIBO attempts to be a complete model of all financial assets at the current time, an XBRL report represents a summary of the business activities over a specified time period, and includes standard aggregations of values calculated in specific ways. These two standards are different and complementary.

XBRL relies upon taxonomies to provide the definitions of the values in the report. The SEC provides a standard base taxonomy for generally accepted accounting principles, and that taxonomy can be extended by industry groups and by individual companies. FIBO provides a standard semantic representation of the financial instruments with precise meanings of the assets, and the relationships that exist with all other assets. Thus a derivative will have a clearly specified relationship to the assets it is derived from, and a credit default swap will be tied in different ways to its related assets. In both cases common terms and domain-specific terminology are required.

What Ontology and FIBO Provide

Many understand that you can’t manage what you can’t measure. Assessing the risk of a collection of assets is impossible when the relationship between those assets is vague. Assessing the value is equally difficult. Basically, the banks had no way to really know what they had on hand, and they lacked a standard that would allow them to collect this understanding. If they did know internally what they had, they lacked a way to communicate that to other businesses.

Regulations and Meanings

A second and more pervasive problem is about regulations of the industry. Those regulations must be described somehow in terms that carry the same meaning to all the main players. The existing laws are flawed by ambiguous category definitions, particularly when different organizations use terms differently. Some organizations will purposefully manipulate the language to game the regulations, but many of the existing derivative instruments were simply so complicated that it was not clear at all where they should fit and how the laws applied.

That is why understanding and utilizing ontology is so critical. Utilizing ontology, and specifically the FIBO implementation, allows one to ask questions about the assets a company holds and get correct answers. How many of the assets are taxable? What is the tax liability? More interesting questions are along the lines of “if this asset became worthless today, which of these other assets would have their value affected, and what other things would be affected in turn?” To run the business, and to regulate it, you need accurate answers to these kinds of questions quickly and consistently.

Given a standard ontology, the rules and regulations can be expressed in terms of these semantics as well. Financial institutions go through all their assets and assign semantic meaning by representing those assets in the ontology. The resulting data set can be queried with SPARQL, and a clear, meaningful answer results. This FIBO semantic map will return the same result for the bank and for the regulators. It will provide tremendous clarity both to the leadership of the bank, and to the regulators.
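As a rough illustration of the dependency question above, consider a toy graph of FIBO-style “derived from” relationships. A production system would model this in OWL and answer it with a SPARQL property-path query; the asset names and relationships here are invented:

```python
# Toy sketch of the exposure question: given "derived from" edges,
# which assets are affected if one asset fails? A real deployment
# would model this in OWL/FIBO and query it with SPARQL property
# paths; all asset names here are invented.

derived_from = [
    ("cds:SwapX", "bond:Corp1"),    # credit default swap on a Corp1 bond
    ("cdo:TrancheA", "bond:Corp1"),
    ("cdo:TrancheA", "bond:Corp2"),
    ("idx:BasketZ", "cdo:TrancheA"),
]

def exposed_to(asset):
    """Transitively find every asset whose value depends on `asset`."""
    exposed, frontier = set(), {asset}
    while frontier:
        nxt = {d for (d, u) in derived_from if u in frontier} - exposed
        exposed |= nxt
        frontier = nxt
    return exposed

print(sorted(exposed_to("bond:Corp1")))
# ['cdo:TrancheA', 'cds:SwapX', 'idx:BasketZ']
```

Because the relationships are explicit in the graph, the bank and the regulator computing over the same data get the same answer, which is the clarity argued for above.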

The need

Getting an Amber Alert far and wide as soon as possible is critical to successfully recovering a child. In fact, pick any emergency event and getting information to the first responders who need it as soon as possible is critical to saving lives. All of these events are obviously time sensitive, and systems are needed which are architected to consume events and trigger the appropriate event processing. As events come fast and furious in a full-blown emergency, it is important that the system design reflects an ability to separate the event logging/recording from the event processing to avoid losing or dropping events.

There are many different event processing and notification systems, and it is unreasonable to expect organizations to adopt one standard. To be successful, any event system must be able to consume events from other event systems, and must also be able to push events into other event systems. The ability to “ripple” events across a disparate network of systems is what this is all about. It is left as an exercise for the reader to identify such a product. Think XML and the need to move data across as many as 300 different systems across all branches of government including justice, public safety, emergency and disaster management, intelligence, and homeland security.
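The “ripple” idea can be sketched as a bridge between two toy publish/subscribe systems: every event consumed from one is pushed into the other. The system names and event shape are invented:

```python
# Toy sketch of "rippling" an event across otherwise disconnected
# event systems: a bridge subscribes to one system and republishes
# into another. System names and the event shape are hypothetical.

class EventSystem:
    def __init__(self, name):
        self.name = name
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        # Deliver the event to every subscriber, in order.
        for callback in self.subscribers:
            callback(event)

county_911 = EventSystem("county-911")
state_alerts = EventSystem("state-alerts")
received = []

# The bridge: every event consumed from county-911 is pushed into
# state-alerts, whose own subscribers then see it.
county_911.subscribe(state_alerts.publish)
state_alerts.subscribe(received.append)

county_911.publish({"type": "AmberAlert", "location": "Route 9"})
print(received[0]["type"])  # AmberAlert
```

Note that logging/recording (the `received` list) is just another subscriber, separate from any processing, matching the design point made above about not dropping events.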

Adding social media to the mix

In a public emergency, the power of social media cannot be ignored. We know how Twitter has been used by authorities to locate people in an emergency. Why not use social media to notify the public? It isn’t a big jump to view Twitter as a public event notification system. One could argue that each “tweet” is in fact an event. Using hashtags, one can spread a message far and wide. Tweets are small (like event notices) and support location. For some types of time-sensitive events, notifying the public is critical, and it would be hard to come up with a more cost-effective way than simply embracing the existing infrastructure of social media.

So What is Needed?

The ability to share real-time data across agencies, data silos, non-compatible systems, sensors and SCADA devices, leveraging social media and mobile devices, presents significant challenges. Having access to a middleware platform that easily provides the integration and connectivity services required would also help reduce the complexity.

This does not factor in the need to put in place security frameworks to ensure that the data is not tampered with in flight and is accessed only by entitled users. This process is clearly complex and costly in the case of one information provider and one information consumer, and it grows geometrically more complex and costly as the number of bilateral relationships increases.

Worse, because the encoding and security measures must be coded into the applications, change management and testing become prohibitive. All of this sounds daunting, and it is. What is needed is an industry-wide sharing standard. A data-sharing standard would eliminate much of the complexity, as would a middleware platform that can play the role of an enterprise service bus, providing out-of-the-box connectivity between these incompatible systems and data sources.

Introducing NIEM

The National Information Exchange Model (NIEM) is a multi-agency framework which defines how to share data. To say that the vision of NIEM is ambitious is an understatement. However, the benefit if successful is massive, and could greatly increase the efficiencies of communication across and outside of government.

NIEM was launched in 2005 for the purpose of developing, disseminating and supporting information-sharing standards and processes across all branches of government, including justice, public safety, emergency and disaster management, intelligence and homeland security. Through NIEM, government agency data silos can be bridged to facilitate the sharing of information between them.

NIEM provides a common language, a universal XML-based vocabulary and a framework by which state, local, tribal and federal government agencies may share data in both emergency and day-to-day operations. However, in practice NIEM data must be wrapped in a messaging protocol like SOAP for exchange across the Internet. Proper exchange therefore requires an ability to onboard/offboard NIEM-encoded data into a communications format suitable for the Internet — and then provide the framework to control how that communication is routed, secured and validated to prevent either transmission error or data compromise.
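The onboarding step described above can be sketched with Python’s standard library: build a simplified, hypothetical NIEM-style payload and wrap it in a SOAP envelope for transport. The namespaces and element names are placeholders, not the real NIEM schemas:

```python
# Sketch of wrapping a NIEM-style XML payload in a SOAP envelope for
# exchange, as described above. The NIEM fragment and its namespace
# are simplified placeholders; real exchanges follow an IEPD's schemas.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
NIEM_NS = "http://example.org/niem/amber-alert"  # placeholder namespace

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("aa", NIEM_NS)

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
alert = ET.SubElement(body, f"{{{NIEM_NS}}}AmberAlert")
ET.SubElement(alert, f"{{{NIEM_NS}}}ChildName").text = "Jane Doe"

message = ET.tostring(envelope, encoding="unicode")
print("Envelope" in message and "AmberAlert" in message)  # True
```

The offboarding direction is the mirror image: strip the envelope, validate the payload against the IEPD’s schemas, and hand the NIEM data to the receiving system.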

The interesting thing about NIEM is that the framework does not try to dictate data models that all must use to participate, as other projects attempt to do. Rather, NIEM leaves the definition of these data models to experts through the definition of IEPDs (Information Exchange Package Documentation).

This delegation of model definition through IEPDs is very different from other approaches and, one would argue, increases its chances of success.

NIEM — The data exchange protocol is XML

The number of systems that must interoperate with NIEM data is mind-boggling, encompassing everything from local law enforcement to 911 systems, to fire, to Amber Alert warnings.

Rather than define a new data encoding, the team behind NIEM chose an open standards approach that can be embraced by all. It should come as no surprise that the data exchange protocol for NIEM is XML!

When first looking at NIEM, one can get discouraged, as the data model itself is absolutely massive, with thousands of different data types. Looking closer, it becomes evident that to perform a particular task, say “Amber Alert,” “Commercial Vehicle Collision” or “Track IEPD,” one only has to understand the classes relevant to the task at hand, and these are relatively easy for any domain expert to understand.

An Ideal Middleware Platform

Different organizations use different systems, and this is not going to change. Indeed, we do not want this to change, as it is important for all organizations to be able to adopt the system that does the best job for them and to be able to continue to innovate.

One might take a look at the Cameo E2E Bridge.

SAP® is one of the most widely used enterprise resource planning (ERP) solutions today, enabling organizations across various industries to optimize critical business functions such as accounting and financials, human capital management, supplier relationship management, enterprise performance management, and many others.

In fact, there are over 265,000 installations worldwide. SAP is vital to your business and serves as the cornerstone of your IT infrastructure. Over the years you’ve added capability around it in the form of best-of-breed solutions and SaaS applications, and now you find yourself with a host of applications and data sources that span on-premises and cloud environments. Adding to the challenge are new mobile applications and the need to integrate both BAPI and REST interfaces.

You need an SAP integration solution to make them all work together, to reap the rewards of the investments you’ve made.

One of the major challenges facing companies which rely on SAP is integration. Without proper integration between SAP applications and non-SAP applications, companies fail to fully automate and optimize their business processes.

In the rest of this blog, we’ll take a look at the most common SAP integration scenarios and typical application integration challenges.

SAP Integration Use Cases

Although there are numerous potential use cases for SAP application integration, we’ll focus on the three most common scenarios: CRM integration, supplier integration and integration with third-party purchase order systems.

Integration with CRM Applications

This is perhaps the most common SAP integration scenario. Companies often need to synchronize customer data between their ERP and CRM systems. For example, when a salesperson brings a new customer on board, it is important to get that information into SAP for financials, performance management and other business functions. This integration ensures timely data synchronization between systems and increases overall business agility.

Integration with Supplier Systems

Integrating SAP and supplier systems is another common scenario. For instance, when a purchase requisition for a component or raw material is created in SAP, the request needs to be sent to all possible suppliers. These suppliers then respond to the request with quotes, which are routed back to SAP for further processing. To streamline the process, SAP needs to be integrated with these supplier systems.

Integration with Third-Party Purchase Order Systems

Integration between SAP and third-party purchase order systems is also a common use case. In this scenario, integration with third-party purchase order systems enables new purchase orders to be immediately transmitted and available in SAP.

From manufacturing to marketing, from procurement to product development, from finance to Facebook, the CIO and the CMO should have tremendous insight into their company’s operations, its priorities, its vulnerabilities and its opportunities.

So today, as our legacy systems of record become agile systems of engagement, and as the social revolution opens up all facets of our enterprise to customer interactions as well as customer scrutiny, it is time to eliminate the internally constructed silos that are the primary reason operational pain still exists. This may be a bit far reaching when dealing with change, but shouldn’t we try to engage our customers in product development, service plans and operations, maybe even marketing and pricing options?

So how do we get there? First we must lead by becoming change agents.

With the rise of mobile devices and the growing use of cloud solutions, platforms have exploded, actually increasing the operational pain points in the short run. These new applications must now interact with an ever-increasing number of software platforms — moving the need for integration to the forefront. As cloud computing continues to expand with many predicting a 36-40% compound annual growth in cloud computing through 2016, CIOs must face a growing problem: integrating all these cloud apps and services.

As any of you that have followed our posts in the past know, we have been on the Big Data soapbox for some time now, and while change is in the air we can't forget to address our Big Data strategy as it relates to focusing on the customer.

We believe that CIOs who choose to sit back and wait for “the business” to tell them what to do will end up reporting to the CMO within a year or two. But companies will fare much better if their CIOs eagerly and rapidly begin framing Big Data challenges and opportunities in terms of customers, opportunities, revenue and business value. As we know, much of the talk about Big Data has obscured the fact that the real issue is enabling intelligent and instantaneous analysis to provide optimal insights for business decisions.

CIOs need to ensure they’re looking at these high-volume, high-velocity challenges in the right way: as business enablers, not technical projects. For example: What if you could perform fraud-detection analytics across all of your transactions in real time, instead of across just a random sampling of only a few percent of all those transactions? What if you could analyze three years of customer data in minutes, rather than only the past three months in hours? In the meantime, we can be certain that the scale and speed of this current challenge will only increase as CIOs must rapidly and seamlessly enhance their traditional corporate data with vast new streams of social and mobile data to realize the full potential of these strategic Big Opportunities.

In summary, while there are other important issues CIOs are facing, integration and Big Data will likely be at the top of their list of issues that impact how their company responds to changing customer needs.

We feel that finding the needle in the haystack is the goal, so ensuring that the data you are analyzing has been modeled and filtered is of the utmost importance.

Reality check

The idea that the combination of predictive algorithms and Big Data will change the world is a tempting one, and it may end up being true. But for now, the industry is facing a reality check when it comes to Big Data analytics.

Instead of focusing on what algorithms to use, your Big Data success depends more on how well you cleaned, integrated and transformed your data.

More important than algorithms

The dirty little secret of Big Data analytics is all the work that goes into prepping and cleaning the data before it can be analyzed. You may have the sharpest data scientists on the planet writing the most advanced algorithms the universe has ever seen, but it won’t amount to a hill of beans if your data set is dirty, incomplete or flawed. That’s why up to 80 percent of the time and effort in Big Data analytic projects is spent on cleaning, integrating and transforming the data.
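As a toy illustration of where that 80 percent of effort goes, the following Python sketch performs three typical prep steps (dropping incomplete rows, normalizing a join key, and deduplicating) on a handful of invented records; the field names and values are illustrative only.

```python
# Minimal sketch of the "prep" stage that dominates Big Data projects:
# cleaning, integrating and transforming raw records before any analysis.

raw_records = [
    {"customer": " Acme Corp ", "revenue": "1200.50"},
    {"customer": "acme corp",   "revenue": "1200.50"},  # duplicate after normalization
    {"customer": "Beta LLC",    "revenue": None},       # incomplete -> dropped
    {"customer": "Gamma Inc",   "revenue": "980"},
]

def clean(records):
    seen, cleaned = set(), []
    for r in records:
        if r["revenue"] is None:                 # drop incomplete rows
            continue
        name = r["customer"].strip().lower()     # normalize the join key
        if name in seen:                         # deduplicate on the key
            continue
        seen.add(name)
        cleaned.append({"customer": name, "revenue": float(r["revenue"])})
    return cleaned

cleaned = clean(raw_records)  # 4 raw rows -> 2 trustworthy rows
```

Even this trivial example shows the point: half the input never reaches the analysis, and the algorithm downstream is only as good as this step.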

Validate the data

Instead of focusing on algorithms, people ought to be focused on validating the data. Everybody basically has the same algorithms. I would suggest that the Big Data teams have a good grasp on the role Ontology plays in their Big Data initiative, especially when dealing with unstructured data.

In the case of financial markets the new FIBO (Financial Industry Business Ontology) standard should be well understood. Dodd-Frank brings a whole slew of new challenges.

Healthcare — is there any domain that has a broader mix of terminologies that need to be comprehended, or a greater need for system integration?

Over the years, doctors have developed their own specialized languages and lexicons to help them store and communicate general medical knowledge and patient-related information efficiently. The promise of a global standard for electronic health care records is still years away. As we know, medical information systems need to be able to communicate complex and detailed medical data securely and efficiently. This is obviously a difficult task and requires a profound analysis of the structure and the concepts of medical terminologies. While this task sounds daunting, it can be achieved by constructing medical domain ontologies for representing medical terminology systems.

The most significant benefit that ontologies may bring to healthcare systems is their ability to support the indispensable integration of knowledge and data.

Unfortunately, ontologies are not widely used in software engineering today. They are not well understood by the majority of developers. Undergraduate computer science programs don’t usually teach ontology. There is an urgent need to educate a new generation of ontology-savvy healthcare application developers.

We believe there is a growing need to familiarize ourselves with ontology and OWL. Here are the basics:

  • Ontology is about the exact description of things and their relationships.
  • For the web, ontology is about the exact description of web information and relationships between web information.
  • Ontologies are the next emerging generation of database concepts and technology.

OWL stands for Web Ontology Language

  • OWL was designed for processing information.
  • OWL was designed to provide a common way to process the content of web information (either by displaying it in diagrams or making it available in the web processes).
  • OWL was originally developed to be read by computer applications (instead of humans).
  • OWL builds on RDF, but is a stronger language with greater machine interpretability, and it uses RDF language elements.
  • OWL comes with a larger vocabulary than RDF. However, RDF is simpler, and is used as a basis for the bazillion sets of data being added to the web from countries and special interest groups all over the world.
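To ground the bullet points above, here is a small, dependency-free Python sketch that treats RDF-style triples as plain tuples and performs a tiny RDFS/OWL-flavored inference: walking subClassOf links to find every class an individual belongs to. The class names are invented for illustration; a real system would express this in OWL through an RDF toolkit.

```python
# Triples as plain (subject, predicate, object) tuples, standing in for RDF,
# plus a toy transitive-subclass inference of the kind OWL reasoners perform.

triples = {
    ("CheckingAccount", "subClassOf", "Account"),
    ("Account", "subClassOf", "FinancialProduct"),
    ("acct42", "type", "CheckingAccount"),
}

def types_of(entity, triples):
    # start from the directly asserted types, then walk subClassOf
    # links transitively (a miniature RDFS-style entailment)
    result = {o for s, p, o in triples if s == entity and p == "type"}
    frontier = list(result)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == "subClassOf" and o not in result:
                result.add(o)
                frontier.append(o)
    return result

# asking "what is acct42?" now also yields the inherited classes
inferred = types_of("acct42", triples)
```

The point of the exercise: nobody ever asserted that acct42 is a FinancialProduct, yet the ontology lets software conclude it, and that is exactly the "exact description of things and their relationships" the bullets describe.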

We are hearing more and more about the Internet of Things (IoT).

The rapid growth of the IoT, along with the explosion of Big Data, cloud and mobile applications, is forever changing our information landscape in real time. As the IoT continues to grow and connect things, it’s important to emphasize that as more things, people and data become connected, the power of the Internet grows exponentially.

A little history might be in order. Going back some 30 years, there were just 1,000 connections to the Internet throughout the world. Today, with the help of app-centric infrastructure, sensors and mobile devices, there are about 13 billion connections, and this is still just one percent of what’s possible. The economic opportunity to connect the unconnected totals $19 trillion.

In just six years, we expect 50 billion things to be connected to the Internet, which will still be just scratching the surface of what’s possible.

We know that data is doubling every two years, and according to IDC the digital universe will expand to 44 zettabytes, or 44 trillion gigabytes, annually by 2020. That’s even more staggering when you consider that today, 90 percent of data is dark — it is only viewed once or not at all. This is another reason to filter the data being collected before applying a Big Data tool.

According to a recent survey and study done by the Pew Research Internet Project, a large majority of the technology experts and engaged Internet users who responded — 83 percent — agreed with the notion that the Internet/Cloud of Things and embedded and wearable computing will have widespread and beneficial effects by 2025. Cisco created a dynamic “connections counter” to track the estimated number of connected things from July 2013 until July 2020. This concept, where devices connect to the Internet/web via low-power radio, is the most active research area in IoT. The low-power radios do not need to use Wi-Fi or Bluetooth. Lower-power and lower-cost alternatives are being explored under the category of Chirp Networks.

This explosion of data and apps — when properly optimized — presents unprecedented opportunities to better manage resources and improve quality of life. By embracing the Internet of Everything (IoE), another term being used, businesses across the globe can lead the way toward a more sustainable world. The question business leaders need to ask themselves is “Will our business be a leader or a follower?” Selecting the right technology partners early could determine the difference.

Taking a look at the Big Data frenzy, one should ask the question: how much of Big Data is actually useful?

By applying just a little common sense, we discover that it’s only a small amount.

We have been working with data for over 40 years, and if we go back to pre-Internet days we experienced what we then called data overload. We discovered that data itself wasn’t valuable; only a small slice of that data proved to have a direct impact on actual business decisions. With that history in mind, the most critical issue has not really changed: finding the data that is actually useful. Volume has certainly increased, but what is important to recognize is that much of the growth in volume comes in the form of unstructured data. So we will start with “What is unstructured data?” using the definition from Webopedia.

Data can be designated as unstructured or structured data for classification within an organization. The term unstructured data refers to any data that has no identifiable structure. For example, images, videos, email, documents and text are all considered to be unstructured data within a dataset.

While each individual document may contain its own specific structure or formatting that is based on the software program used to create the data, unstructured data may also be considered “loosely structured data” because the data sources do have a structure but all data within a dataset will not contain the same structure. This is in contrast to a database, for example, which is a common example of structured data.

So looking back in history we are talking about data overload with an added new twist called unstructured data, which represents much of the new volume being generated. We would suggest that companies which bring a combination of strong data analytical expertise along with a good grasp of both industry standards and compliance rules can offer precise filtering solutions that can identify the most valuable data for the user.
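As a hedged sketch of such a filtering solution, the Python below tags unstructured text against a tiny hand-built taxonomy and keeps only the documents that match a business topic. The taxonomy terms and documents are invented for illustration; a production system would use a proper domain ontology and NLP pipeline rather than word matching.

```python
# Toy taxonomy-driven filter: keep only the slice of unstructured text
# that matters for a given business question. All terms are illustrative.

taxonomy = {
    "warranty": {"warranty", "claim", "defect"},
    "churn":    {"cancel", "refund", "complaint"},
}

docs = [
    "Customer wants to cancel and is asking for a refund.",
    "Quarterly picnic scheduled for Friday.",
    "Warranty claim filed for a defect in unit 7.",
]

def tag(doc, taxonomy):
    # crude tokenization, then match against each topic's term set
    words = set(doc.lower().replace(".", "").replace(",", "").split())
    return {topic for topic, terms in taxonomy.items() if words & terms}

relevant = [d for d in docs if tag(d, taxonomy)]  # the picnic memo is dropped
```

The value is in the taxonomy, not the code: with industry standards and compliance rules encoded as term sets, most of the volume can be discarded before any expensive analytics run.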

Peeling Back the Onion

There are numerous solutions emerging that address the filtering and analytics of structured data. With Splunk, for example, enterprises collect, index and harness all of the fast-moving machine data generated by applications, servers and devices — physical, virtual and in the cloud.

In the case of what Hadoop brings to the table, there are many others that have debated its pluses and minuses and we will leave that topic to them.

The real challenge is to provide cost-effective solutions that address the much more complex world of filtering and real-time analytics of unstructured data. Additionally, extracting value from Big Data requires trained experts who understand semantics, statistics, algorithms and analytics. Currently these resources are hard to find. According to a recent McKinsey study, the U.S. is now facing a shortage of talent with the expertise to understand and make decisions around Big Data.

While the volume of all data types is expected to grow 800% in the next five years, 80% of that growth will be unstructured data. We would suggest that companies which possess skills and capabilities that include data modeling, analytics, OCL and ontology have a leg up when it comes to delivering solutions that leverage both structured and unstructured data. As of today, the jury is still out on who will be the players that will offer compelling solutions that address the holy grail of finding the needle in the haystack in the growing world of Big Data.

One approach to consider

No Magic has a solution that will improve the time it takes to react to ever-changing customer needs, provide keener business intelligence, and lay the foundation to more effectively deal with identifying what part of the Big Data hype might be relevant to your need. One of the key elements needed to ensure you are able to analyze the right customer trends is to have an enterprise-wide platform that integrates all of your legacy systems with your newer cloud and mobile applications. We offer end-to-end integration of your operational legacy systems with your newer customer-facing applications that may reside in the public and private cloud.

The Cameo E2E Bridge

The Cameo E2E Bridge provides any enterprise an easy way to integrate legacy systems with new cloud and mobile applications. This platform uses a 100% business model approach that delivers much greater business transparency than alternative methods. Since the platform is 100% model driven, it lays the foundation to model both structured and unstructured data in the exact format that is most relevant to your specific business need. To top it off, when it is time to implement Hadoop, the Cameo E2E Bridge provides a direct interface, providing for a complete end-to-end solution.

Let's make sure we are all starting from the same baseline as it relates to Ontology and its value.

It is important to comprehend that ontology enables knowledge sharing and reuse. In that context, ontology is a specification used to define and formalize industry-specific vocabularies. It has some nice properties for knowledge sharing among AI software (e.g., semantics independent of reader and context). Practically, an ontological commitment is an agreement to use a vocabulary (i.e., ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology. Once implemented for a specific industry domain, it can dramatically improve the chances of finding the needle in the haystack as it relates to Big Data initiatives. This discussion then will focus on the financial industry; the umbrella for this activity has been designated the Financial Industry Business Ontology (FIBO). An update on where FIBO stands follows:

Financial Industry Business Ontology

The Enterprise Data Management Council (EDM) is the author and steward of the Financial Industry Business Ontology (FIBO). FIBO is a collaborative effort among industry practitioners, semantic technology experts and information scientists to standardize the language used to precisely define the terms, conditions and characteristics of financial instruments; the legal and relationship structure of business entities; the content and time dimensions of market data; and the legal obligations and process aspects of corporate actions.

FIBO is being released as a series of standards under the technical governance of the Object Management Group (OMG), with the direct involvement of EDM Council membership. FIBO is based on the legal structures and obligations contained within the myriad of contracts that form the foundation of the financial industry.

Key FIBO Benefits

FIBO presents knowledge about financial instruments, business entities, market data and corporate actions in a technology-neutral format, along with formal definitions and defined business relationships. FIBO standardizes the language of financial contracts and promotes unambiguous shared meaning among all participants in the financial industry. Use cases and benefit areas include:

Common reference standard for aligning multiple repositories of data: FIBO provides the mechanism for data consistency and comparability. It facilitates data integration, supports business process automation and enables consolidated views across the financial industry.

Standardized language for internal and external communication: FIBO is a business language and facilitates precise communication (in place of ad hoc spreadsheets and technical models) about data requirements between business stakeholders and IT developers of applications, message schemas, etc.
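To illustrate the "common reference standard" idea, here is a small Python sketch in which two repositories with different local field names are mapped onto one shared term. The `fibo:LegalEntityName` label and the field names are illustrative stand-ins, not actual FIBO IRIs or real schemas.

```python
# Sketch of a common-reference-standard mapping: two repositories use
# different local field names, and both map to one shared ontology term,
# enabling consistency and comparability across them.

SHARED_TERM = "fibo:LegalEntityName"  # illustrative, not a real FIBO IRI

mappings = {
    "crm":    {"cust_name": SHARED_TERM},
    "ledger": {"counterparty": SHARED_TERM},
}

def to_common(source, record, mappings):
    # rename each local field to its shared ontology term, if one is mapped
    return {mappings[source].get(k, k): v for k, v in record.items()}

a = to_common("crm",    {"cust_name": "Acme Corp"}, mappings)
b = to_common("ledger", {"counterparty": "Acme Corp"}, mappings)
# both records now align on the same shared term and can be compared directly
```

This is the whole mechanism in miniature: once every repository publishes into the shared vocabulary, consolidated views across the industry stop requiring pairwise translation between every system.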

While we may look enviously on the success stories, the challenges of actually implementing a Big Data strategy are many.

The Financial Services industry thrives on a whole host of data: transaction data, customer data, market data and social media data, among others. As computing power has evolved to be able to process greater quantities and types of data at greater speeds than ever before, many financial services organizations have been waking up to the potential opportunities.

Over 70 percent of banking and financial markets firms say that information and analytics are creating a competitive advantage for their organizations, according to a recent study from the IBM Institute for Business Value in collaboration with the Saïd Business School at the University of Oxford.


The potential use cases within financial services are huge, especially as the market moves from "product centric" to "client centric." For instance, banks are already using Big Data for fraud detection (and prevention), customer insight, regulatory compliance and reporting, and improving operations.

However, while we may look enviously on the success stories, the challenges of actually implementing a Big Data strategy are many.

So what's holding financial institutions back from realizing the potential of Big Data?

1. Lack of Leadership Support

Senior management is often not completely convinced of the benefits of Big Data, and thus not willing to invest in or build robust tools to deliver integrated solutions. As with all initiatives, you really need to demonstrate the benefits that investing in one particular solution over another would yield. When it comes right down to it, hard numbers are what will speak to your senior leadership teams.

2. Company Culture

Internal politics, performance review structures, and organizational structures can impede the progress of Big Data as business units and individual managers may feel uncomfortable with the open sharing of data. While it's impossible to change company culture overnight, you need to be aware that some people might not be as excited about a more open sharing of data, and look at how you can work with Human Resources to ensure that performance schemes are aligned with the brave new world of open data you're going to be introducing to your business.

3. Lack of Integration (data is distributed in legacy systems throughout the organization)

Fragmented business processes and distributed data (e.g., data in legacy IT systems) create big challenges for Big Data. Many banks, for instance, hold large reservoirs of important data in their legacy systems, and these systems do not work easily with systems like Hadoop. This is not a challenge that can be easily tackled; it requires time and investment to extract the data out of these old systems and put it in a usable form, and this needs to be factored into your Big Data initiatives.

4. Regulatory Requirements

After the 2008 economic meltdown, there has been an increase of new, strict regulations (e.g., Dodd Frank, Basel III, FATCA) resulting in complex rules governing access to critical client data. Thus it has become difficult to negotiate the maze of regulations around data access. Time to get your legal department involved.

5. Business Silos and Modules

Financial institutions, banks and insurance companies are mature organizations that have often gone through multiple acquisitions over the years. This creates very siloed business modules and hidden barriers to Big Data. These structures, like legacy systems, are likely to slow your implementation down and should be factored into your plans.

6. Data Security

Protecting data from security risks is a massive concern for financial services organizations and needs to be built into any Big Data projects, adding an additional layer of complexity.

7. Data Quality

Poor data quality can result in defective analysis. If managers do not trust the underlying data, they will not trust the resulting analysis. There are many steps organizations can take to improve the quality of both structured and unstructured data being used in analytical systems. For unstructured data, for instance, organizations can use ontologies with end-user inputs, semantic libraries and taxonomies, while for structured data it's crucial to emphasize the ongoing importance of inputting the data correctly.

8. Lack of Talent/Experience

Because this is a relatively new and emerging area, there is a scarcity of people with the required analytical, technical and business skills to generate business results from Big Data. Equally, there is a dearth of people with prior professional experience actually implementing Big Data systems. Clearly, as universities catch up with changing business needs and more companies embark on Big Data strategy, both of these limitations will gradually be overcome naturally.

You can find out more about a proposed model to bring together a more integrated business architecture to support a Big Data implementation in my whitepaper Capitalizing on Big Data in Financial Services through Integration & Optimization.

But what do you think? Have you encountered any of these challenges in your Big Data implementations? Which one posed the greatest barrier?

When it comes to definition and the description of Ontology we have referenced content from Wikipedia (Ontology).

Ontology applies the power, simplicity and speed of semantic search to gain insight into all enterprise application data, replacing traditional data integration. This means the ability to search and link applications, databases, files, spreadsheets, etc., anywhere, without the cost and risk of integration.

In the same way that Google made it possible to find any "string" in the Internet via text search, Ontology makes it possible to find any "thing" across enterprise data and applications via just-enough semantic modeling and graph-search. This means the ability to search and link core applications, databases, big data sources, files, spreadsheets, documents, emails, etc., anywhere, without the cost and risk of integration.


Ontology systems are revolutionizing how companies use their applications and data. The Internet and the Internet of Things continue to create a strong demand for sharing the semantics of data. Ontologies are becoming increasingly essential for nearly all data-rich applications. Companies are looking toward them as vital machine-processable semantic resources for many application areas. The reason they are important to understand and grasp is they can all but eliminate the difficult task of integrating all of your systems. By sharing an ontology, autonomous and distributed applications can meaningfully communicate to exchange data and thus make transactions interoperate independently of their internal technologies.

However, some confusion on how to reuse these techniques is evident. For example, many have confused ontologies with data schemes, knowledge bases or even logic programs. Unlike a conceptual data schema or a "classical" knowledge base that captures semantics for a given enterprise application, the main and fundamental advantage of an ontology is that it captures domain knowledge highly independently of any particular application or task.

Eliminating the need to Integrate

We have discussed how traditional data integration is difficult. To reemphasize that point, let's reflect. Corporations worldwide spend vast sums of money and time trying to get usable knowledge by integrating the often-conflicting data spread across their many applications.

Unfortunately, data integration has been necessary whenever initiatives like customer care dashboards, compliance or financial reporting, IT consolidation, data migration and business intelligence are contemplated. The high cost, difficulty and risk of data integration are uncontroversial, as numerous research firms have noted that over 80% of data migrations fail.

Before Internet search engines, such as Google and Bing, you needed to be told the address of each website. The ability to search the unstructured web for strings and phrases ("string") revolutionized the Internet. Nobody integrated the Internet.

The Impact of Search

Before enterprise search engines and wikis, one needed to know the name and full location (the server, folder, filename) of the document being searched. This took time. Enterprise-wide string-search of unstructured documents and emails revolutionized IT document management by making it easy for anybody, even customers, to get results. No one needs integration to find enterprise documents.

So — what about all of the structured data that exists across enterprise applications? You simply cannot search applications using strings. Applications are not documents; they contain data that represents "things."

So until now, one needed data integration to figure out, for example, which ten applications reference the same customer, and which five systems define the services and infrastructure that they depend upon. This meant never-ending master data management projects, CRM consolidations and Customer 360 integrations. All expensive, and often delivered late.

Ontology solves this problem through one simple insight: As Google has indexed every website on the Internet via "strings" to deliver the simplicity of search, so can we index every enterprise application via semantics or "things" to deliver far simpler integration.
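A minimal Python sketch of that insight: index "things" from several applications by a shared identity, then answer "which applications reference this customer?" with a lookup instead of a point-to-point integration project. The application names and records are invented for illustration.

```python
# Toy semantic index: each application contributes (app, kind, thing)
# records, keyed by the "thing" they describe. Searching the index
# replaces integrating the applications themselves.

records = [
    ("CRM",     "customer", "acme"),
    ("Billing", "account",  "acme"),
    ("Support", "ticket",   "acme"),
    ("CRM",     "customer", "globex"),
]

index = {}
for app, kind, thing in records:
    index.setdefault(thing, []).append((app, kind))

def search(thing):
    # which applications reference this thing, in any role?
    return sorted({app for app, _ in index.get(thing, [])})

apps_with_acme = search("acme")  # CRM, Billing and Support all know "acme"
```

Just as a web crawler indexes pages without merging websites, the extractors here only have to emit records into the index; no application has to be modified or wired to any other.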

For more information or a quote
please contact
or call +1-214-291-9100.