AllegroGraph® RDFStore 4.0

AllegroGraph RDFStore is a modern, high-performance, persistent RDF graph database. AllegroGraph uses disk-based storage, enabling it to scale to billions of triples while maintaining superior performance. AllegroGraph supports SPARQL, RDFS++, and Prolog reasoning from numerous client applications.

AllegroGraph New V4.0 Features
  • AllegroGraph is 100 percent ACID, supporting Transactions: Commit, Rollback, and Checkpointing. See the new tutorials for the Java and Python clients
  • Full and Fast Recoverability
  • 100% Read Concurrency, Near Full Write Concurrency
  • Online Backups
  • Dynamic and Automatic Indexing – All committed triples are always indexed (7 indices)
  • Advanced Text Indexing – Lucene style but faster, text indexing per predicate. See the new tutorials for the Java and Python clients
  • Duplicate Triple deletion while indexing
  • All Clients based on HTTP REST Protocol – Java, Sesame, Jena, and Python
  • Completely multi-processing based (SMP) – Automatic Resource Management for all processors and disks, and optimized memory use. See the new performance tuning guide and the new server configuration guide
  • Column-based compression of indices similar to column-based RDBMS – reduced paging, better performance
  • Dedicated and Public Sessions – In dedicated sessions users can work with their own rule sets against the same database
  • Python Client Improvements – We now provide a full Python interface. The API is based on the Java Sesame interface and includes Spatial-Temporal and Social Network support
  • LUBM Benchmarks – Updated for this release
  • Mark Watson's new book: Practical Semantic Web and Linked Data Applications, Java, Clojure, Scala, and JRuby Edition

The primary emphasis of AllegroGraph version 4 development has been to bring AllegroGraph triple store technology to the Enterprise, for semantic application deployment. Many other improvements have been made as well. The transition from AllegroGraph 3.x to 4.x involved many small changes to the AllegroGraph APIs, and several larger ones. Please see here for code conversion notes.

High-performance Storage

AllegroGraph is designed for maximum loading speed and query speed. (See here for LUBM query results.) Loading of triples, through its highly optimized RDF/XML, N-Quads, and N-Triples parsers, is best-of-breed, particularly with large files. The AllegroGraph product line has pushed the performance envelope since version 1.0 in 2004, which was the first product to claim 1 billion triples loaded and indexed using standard x86 64-bit hardware. AllegroGraph, a purpose-built RDF Quad Store (not a modified RDBMS), continued to drive innovation in the marketplace with the 2008 SemTech conference example of 10 billion triples loaded on Amazon’s EC2 service. The new 4.0 release keeps performance at the forefront of Franz’s Semantic Technologies as the industry's first OLTP semantic web database. AllegroGraph 4.0 automatically manages all available hardware resources to maximize loading, indexing, and query capabilities, which once again raises the bar for RDF storage performance. The following table displays an example of AllegroGraph's performance in loading.

Load Test    # Triples        Time                                       Load Rate *
LUBM 8000    1.106 Billion    2 hours, 37 minutes, and 46.222 seconds    120.66 K/S

* Load Rate = Thousand triples per second

The platform for the test was two 6-core AMD Opteron 2439 SE processors at 2.8 GHz, with 64 GB RAM, running Fedora 10.

AllegroGraph RDFStore Architecture

The primary interface to AllegroGraph is a REST protocol architecture, essentially a superset of the Sesame HTTP Client. Franz’s staff directly supports adapters for several languages: Sesame Java, Sesame Jena, Python (using the Sesame signatures), and Lisp. Open Source Adapters for C#, Ruby, Clojure, Scala, and Perl are available through community projects. Links to download the clients are here.
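
To give a feel for what a Sesame-style REST interface looks like from a client, the following sketch builds (but does not send) a SPARQL query request using only the Python standard library. The host, port, repository name, and the helper build_sparql_request are assumptions made for illustration. The URL layout follows the Sesame HTTP protocol convention of /repositories/<name> with a query parameter and should be checked against the server documentation before use.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values for illustration only; substitute your own server and repository.
BASE_URL = "http://localhost:10035"
REPOSITORY = "example"


def build_sparql_request(base_url, repository, sparql):
    """Build a GET request for a SPARQL query, Sesame HTTP protocol style.

    The request is only constructed here, not sent.
    """
    url = f"{base_url}/repositories/{repository}?{urlencode({'query': sparql})}"
    # Ask for SPARQL results in JSON; other result formats can be negotiated too.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(
    BASE_URL, REPOSITORY, "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running server. In practice the supported client libraries wrap these calls.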

Powerful and Expressive Reasoning and Querying

AllegroGraph provides the broadest array of mechanisms to query and access knowledge in an RDF datastore:

  • RDFS++ Reasoning - Dynamic Materialization

    Description logics or OWL-DL reasoners are good at handling complex ontologies. They tend to be complete (give all the possible answers to a query) but can be totally unpredictable with respect to execution time when the number of triples increases beyond millions. AllegroGraph offers a less complete but very fast and practical RDFS++ reasoner. We support all the RDF and RDFS predicates and some OWL predicates. The supported predicates are rdf:type, rdfs:subClassOf, rdfs:range, rdfs:domain, rdfs:subPropertyOf, owl:sameAs, owl:inverseOf, owl:TransitiveProperty, and owl:hasValue.

    AllegroGraph's RDFS++ engine dynamically maintains the ontological entailments required for reasoning: it has no explicit materialization phase. Materialization is the pre-computation and storage of inferred triples so that future queries run more efficiently. The central problem with materialization is its maintenance: changes to the triple-store's ontology or facts usually change the set of inferred triples. In static materialization, any change in the store requires complete re-processing before new queries can run. AllegroGraph's Dynamic Materialization simplifies store maintenance and reduces the time required between data changes and querying.
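
    The difference between stored and query-time entailments can be illustrated with a toy example. The sketch below is not AllegroGraph's engine; it is a minimal illustration, with made-up class names and helper functions, of answering an rdfs:subClassOf query from asserted facts alone, so that an ontology change is visible to the next query without a re-materialization pass.

```python
def superclasses(cls, subclass_of):
    """Return every class reachable from cls via rdfs:subClassOf, computed on demand."""
    seen, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(target, types, subclass_of):
    """Find resources whose asserted type is target or a subclass of it."""
    return sorted(
        resource
        for resource, cls in types
        if cls == target or target in superclasses(cls, subclass_of)
    )


# Asserted facts only; no inferred triples are stored anywhere.
subclass_of = {"Dog": ["Mammal"], "Mammal": ["Animal"]}
types = [("rex", "Dog"), ("tweety", "Bird")]

before = instances_of("Animal", types, subclass_of)

# An ontology change takes effect on the very next query, with no re-processing step.
subclass_of["Bird"] = ["Animal"]
after = instances_of("Animal", types, subclass_of)
print(before, after)
```

    With static materialization, the inferred triple stating that tweety is an Animal would have to be computed and stored before the second query could return it.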

  • SPARQL Queries on Named Graphs

    SPARQL, the W3C standard RDF query language, returns RDF, XML and other formats in responses to queries. AllegroGraph's SPARQL, one of the W3C's "interoperable implementations", includes a query optimizer, and has full support for named graphs. It can be used with the RDFS++ reasoning turned on (i.e., query over real and inferred triples). SPARQL can be used with every available AllegroGraph interface mentioned in the previous section.

  • Prolog

    AllegroGraph's RDF Prolog provides concise, powerful, industry-standard, domain-specific reasoning to build high-level concepts (that require complex rules or numerical processing) on top of RDF data. AllegroGraph Prolog is an option because many use cases are difficult (or very cumbersome) to model with only RDF/RDFS and OWL. Prolog can also be used on top of the RDFS++ reasoner as a rule based system.

  • Low-level APIs

    Allow fast, 'close-to-the-metal' access to triples by subject, predicate, and object.

Additional Features

Other essential Triple-Store features:

  • Geospatial and Temporal Reasoning

    AllegroGraph stores geospatial and temporal data types as native data structures. Combined with its indexing and range query mechanisms, AllegroGraph lets you perform geospatial and temporal reasoning efficiently.

  • Social Networking Analysis

    AllegroGraph includes an SNA library that treats a triple-store as a graph of relations, with functions for measuring importance and centrality as well as several families of search functions. Example algorithms are nodal-degree, nodal-neighbors, ego-group, graph-density, actor-degree-centrality, group-degree-centrality, actor-closeness-centrality, group-closeness-centrality, actor-betweenness-centrality, group-betweenness-centrality, page-rank-centrality, and cliques. Geospatial and temporal primitives combined with SNA functions form an Activity Recognition framework for flexibly analyzing networks and events in large volumes of structured and unstructured data.
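
    To make two of these measures concrete, the following sketch computes degree centrality and graph density over a handful of triples. It uses the common textbook definitions and hypothetical data and function names; it does not call the AllegroGraph SNA library, whose exact definitions may differ.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples describing who knows whom.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
]


def neighbors(triples, predicate):
    """Build an undirected adjacency map from triples with the given predicate."""
    adjacency = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            adjacency[s].add(o)
            adjacency[o].add(s)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each actor divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    return {actor: len(linked) / (n - 1) for actor, linked in adjacency.items()}


def graph_density(adjacency):
    """Fraction of possible undirected edges that are actually present."""
    n = len(adjacency)
    edges = sum(len(linked) for linked in adjacency.values()) / 2
    return edges / (n * (n - 1) / 2)


adjacency = neighbors(triples, "knows")
centrality = degree_centrality(adjacency)
density = graph_density(adjacency)
```

    In this small graph carol is linked to every other actor, so her degree centrality is 1.0, and four of the six possible edges exist.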

  • Native Data Types and Efficient Range Queries

    AllegroGraph stores a wide range of data types directly in its low level triple representation. This allows for very efficient range queries and significant reduction in triple-store data size. With other triple-stores that only store strings, the only way to do a range query is to go through all the values for a particular predicate. This works well if everything fits in memory; but if the predicate has millions of triples, it will need costly machines with huge amounts of RAM. AllegroGraph supports most XML Schema types (native numeric types, dates, times, longitudes, latitudes, durations and telephone numbers).
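
    The cost of string-only storage can be seen in a few lines. The sketch below uses hypothetical values and is not AllegroGraph code. It shows that comparing numbers as strings gives wrong range results, whereas natively typed values kept in sorted order can be range-queried with two binary searches.

```python
from bisect import bisect_left, bisect_right

# Hypothetical ages stored as the objects of an "age" predicate.
ages_as_strings = ["9", "10", "25", "100", "42"]

# Comparing strings orders them lexicographically, so "100" sorts before "25"
# and a range test on the raw strings gives the wrong answer.
string_matches = [a for a in ages_as_strings if "10" <= a <= "50"]

# With native integers kept in sorted order, a range query is two binary
# searches plus a slice instead of a scan over every value.
ages = sorted(int(a) for a in ages_as_strings)
low, high = 10, 50
numeric_matches = ages[bisect_left(ages, low):bisect_right(ages, high)]
```

    Here the string comparison wrongly includes "100", while the numeric query returns only the values between 10 and 50.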

  • Free-text Indexing

    AllegroGraph supports free-text indexing on the objects of triples whose predicates have been registered for indexing. Once indexed, triples can be found using a simple but robust query language. Free-text indexing support includes functions to register predicates and see which predicates are registered.
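
    The idea of indexing only registered predicates can be sketched with a toy inverted index. The class and method names below are hypothetical and do not correspond to the AllegroGraph API; the sketch assumes simple whitespace tokenization and an all-words-must-match query.

```python
from collections import defaultdict


class FreeTextIndex:
    """Toy inverted index over the objects of registered predicates."""

    def __init__(self):
        self.registered = set()
        self.postings = defaultdict(set)  # word -> set of indexed triples

    def register_predicate(self, predicate):
        self.registered.add(predicate)

    def add(self, subject, predicate, obj):
        # Only triples whose predicate has been registered are indexed.
        if predicate in self.registered:
            for word in obj.lower().split():
                self.postings[word].add((subject, predicate, obj))

    def search(self, *words):
        """Return triples whose object text contains every given word."""
        hits = [self.postings.get(w.lower(), set()) for w in words]
        return set.intersection(*hits) if hits else set()


index = FreeTextIndex()
index.register_predicate("rdfs:comment")
index.add("ex:ag", "rdfs:comment", "A persistent RDF graph database")
index.add("ex:ag", "rdfs:label", "graph database")  # not registered, so skipped
results = index.search("graph", "database")
```

    Only the rdfs:comment triple is returned, because rdfs:label was never registered for indexing.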

  • Named Graphs for Weights, Trust Factors, Provenance

    AllegroGraph actually stores quints. A triple in AllegroGraph contains 5 slots, the first three being subject (s), predicate (p), and object (o). The remaining two are a named-graph slot (g) and a unique id assigned by AllegroGraph. The id slot is used for internal administrative purposes, but can also be referred to by other triples directly.

    The W3C proposal is to use the 'named-graph' slot for clustering triples. So for example, you load a file with triples into AllegroGraph and you use the filename as the named-graph. This way, if there are changes to the triple file, you just update those triples in the named graph that came from the original file. However, with AllegroGraph, you can also put other attributes such as weights, trust factors, times, latitudes, longitudes, etc, into the named graph slot.
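
    The file-per-graph pattern can be sketched as follows. This is a minimal in-memory illustration with hypothetical names (Quint, reload_graph) and made-up data, not the AllegroGraph API. It shows the five slots and how a changed file leads to replacing only the triples in its named graph.

```python
from typing import NamedTuple


class Quint(NamedTuple):
    """The five slots described above: subject, predicate, object, graph, id."""
    s: str
    p: str
    o: str
    g: str
    i: int


def reload_graph(store, graph, new_triples, next_id):
    """Replace every triple in one named graph, leaving other graphs untouched."""
    kept = [q for q in store if q.g != graph]
    fresh = [
        Quint(s, p, o, graph, next_id + n) for n, (s, p, o) in enumerate(new_triples)
    ]
    return kept + fresh


store = [
    Quint("ex:a", "ex:knows", "ex:b", "people.nt", 1),
    Quint("ex:x", "ex:near", "ex:y", "places.nt", 2),
]
# The file people.nt changed, so only the triples that came from it are replaced.
store = reload_graph(store, "people.nt", [("ex:a", "ex:knows", "ex:c")], next_id=3)
```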

  • Direct Reification

    AllegroGraph allows triple-ids to be the subject or object of another triple. This is beyond the scope of pure RDF. The advantage of this approach is that you can reduce the total number of triples in the store to a more manageable size, and, even more importantly, dramatically reduce query time because a single query can retrieve more data.
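
    The saving in triple count can be illustrated by annotating one fact both ways. The sketch below uses made-up data; the "triple:17" spelling of a triple id is an assumption for illustration, not AllegroGraph syntax. The standard form uses the RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object).

```python
# A fact to annotate, stored with an id as in the five-slot layout: (s, p, o, g, id).
fact = ("ex:alice", "ex:knows", "ex:bob", "ex:g1", 17)

# Standard RDF reification needs four extra triples to name the statement
# before anything can be said about it.
standard = [
    ("_:st", "rdf:type", "rdf:Statement"),
    ("_:st", "rdf:subject", fact[0]),
    ("_:st", "rdf:predicate", fact[1]),
    ("_:st", "rdf:object", fact[2]),
    ("_:st", "ex:source", "ex:survey2009"),
]

# With direct reification the triple id itself is the subject, so a single
# triple carries the annotation. The "triple:17" spelling is made up here.
direct = [(f"triple:{fact[4]}", "ex:source", "ex:survey2009")]

saved = len(standard) - len(direct)
```

    Each annotated statement saves four triples in this comparison, and a query for the annotation no longer has to join through the rdf:subject, rdf:predicate, and rdf:object triples.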

  • Automatic Resource Management

    The AllegroGraph architecture is designed to maximize hardware resources for all data management procedures (Loading, Indexing, Query, etc.). The hardware utilization can be managed through the AllegroGraph configuration file as necessary.

  • Dynamic and Automatic Indexing

    Index management is now handled entirely by AllegroGraph; you no longer have to think about it. All committed triples are always indexed (7 indices). The indices are:

    • S, P, O, G, I - Subject, Predicate, Object, Named Graph, ID
    • P, O, S, G, I
    • O, S, P, G, I
    • G, S, P, O, I
    • G, P, O, S, I
    • G, O, S, P, I
    • I
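
    Each index is a different sort order over the same five slots, so a lookup is cheapest on the index whose leading slots are the ones the query has bound. The sketch below shows one plausible way to pick an index for a query pattern. The heuristic and the function names are assumptions for illustration, not AllegroGraph's actual query planner; the lowercase names are shorthand for the orders listed above.

```python
INDICES = ["spogi", "posgi", "ospgi", "gspoi", "gposi", "gospi", "i"]


def bound_prefix_length(index, bound):
    """Count how many leading slots of the index's sort order are bound."""
    length = 0
    for slot in index:
        if slot not in bound:
            break
        length += 1
    return length


def choose_index(bound):
    """Pick the index whose sort order starts with the most bound slots."""
    return max(INDICES, key=lambda index: bound_prefix_length(index, bound))


by_predicate_object = choose_index({"p", "o"})
by_graph_subject = choose_index({"g", "s"})
by_id = choose_index({"i"})
```

    A query with predicate and object bound is best served by the P, O, S, G, I order, one with graph and subject bound by G, S, P, O, I, and a lookup by triple id by the I index.
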
  • Federation

    AllegroGraph supports queries across distributed databases. You can group multiple triple-stores, both local and remote, into a single virtual store. Federation allows thread-safe opening of multiple triple-databases from one application (for the read-only parts of the database). Queries over multiple databases are easy with direct data access from applications. AllegroGraph also supports physical merging of databases.

Professional Services

Make the most of your use of semantic technologies by utilizing our consulting services.

We provide:

  • Data Migration - help migrating data from RDBMS or CSV files into AllegroGraph
  • Pilot and Evaluations - with Semantic Technologies in general
  • Migration Assessment - moving to ontology-based systems
  • Milestone Review - we can help verify and reality-check your project
  • Performance Analysis - getting the most out of AllegroGraph
  • Deployment Options - advise on deployment options
  • Application-specific coding

Contact a Franz Product Applications Manager for information about getting started today, at 1-888-256-7669, ext. 300; outside of Canada and the US call +1-510-452-2000, ext. 300 or email: info@franz.com

Compatible Semantic Technologies

  • TopBraid Composer

    TopBraid Composer, developed by TopQuadrant, Inc., is an enterprise-class modeling and application development environment. It provides comprehensive support for modeling ontologies and data, connecting data sources, designing queries, rules and semantic data processing chains, and developing Semantic Web applications. For details see TopBraid Composer

  • RacerPro

    The Semantic Web reasoning system developed by Racer Systems GmbH, RacerPro, has been integrated with AllegroGraph, exposing RDF data in AllegroGraph to Racer's highly optimized Description Logic (DL) reasoner. It is most suitable for ontology-driven applications or theorem proving. RacerPro's interfaces also include DIG over HTTP and support for rules (SWRL). For details see RacerPro

  • AGWebview

    AGWebview, developed by Franz, Inc., is an interface for exploring, querying, and managing AllegroGraph triple stores through a web browser. For details see AGWebview

  • Gruff

    Gruff, developed by Franz, Inc., is a triple-store browser that displays visual graphs of subsets of a store's resources and their links. By selecting particular resources and predicates, you can build a visual graph that displays a variety of the relationships in a triple-store. Gruff can also display tables of all properties of selected resources or generate tables with SPARQL queries, and resources in the tables can be added to the visual graph. For details see Gruff

  • Pepito

    Data mining plays an increasingly important role in the enterprise decision process because of today's competitive necessity to respond to changing market conditions quickly and correctly, leveraging the enormous volume of operating data now available. PEPITo, developed by PEPITe S.A., brings unique capabilities to meet today's data mining needs. For details see Pepito

  • Cogito

    The COGITO platform by Expert System S.p.A. is conceived to bring intelligence to the search, extraction and classification of unstructured information for internal management purposes and for monitoring and analyzing external sources, such as the Internet. For details see Cogito

  • Sentient Suite

    The Sentient Suite, developed by IO Informatics Inc., integrates heterogeneous data to solve knowledge and project management problems for the Life Sciences industry. For details see Sentient Suite

System Requirements

The AllegroGraph 4.0 Server runs natively on 64-bit Linux (x86-64). To run AllegroGraph 4.0 on other operating systems (e.g., Windows, Mac), we suggest you set up a Linux Virtual Machine for non-performance experiments. We provide a Virtual Machine image to facilitate this installation, or you can create one on your own. Clients to an AllegroGraph server may be either 32-bit or 64-bit. The AllegroGraph 4.0 Virtual Machine can be downloaded from the AllegroGraph download page.

64-Bit                        Virtual Machine Appliance (Requires 64-bit hardware)
Linux (x86-64), glibc 2.4     Apple Mac OSX (x86-64) 10.6
Amazon EC2 (Linux x86-64)     32-bit Microsoft Windows XP/Vista/7/Server 2003
                              64-bit Microsoft Windows 2000/XP/Vista/7/Server 2003
Coming Soon:
Apple Mac OSX (x86-64) 10.6
64-bit Microsoft Windows 2000/XP/Vista/7/Server 2003
Sun Solaris (x86-64) 2.10

The VMware Appliance will let you run the AllegroGraph Linux version on a Windows or Mac operating system. Performance will be slower than running natively, so we encourage you to install AllegroGraph natively for performance evaluation.

For native 64-bit Mac, Windows, and Solaris, and all 32-bit systems, please use AllegroGraph 3.3.

Installation details for the Virtual Machine Appliance.

For more info, send email to AllegroGraph@franz.com or call (510) 452-2000, Option 3-Sales and Marketing.

Copyright © 2010 Franz Inc., All Rights Reserved