Technology Matters at Alfresco
We have tended to downplay the technical innovation in marketing Alfresco and emphasize cost, ease of use and the benefits of open source. In an era of consolidation and commoditization, the marketing of technology doesn’t matter as much as ease, convenience and performance. However, a major trade journal asked us the following question, and I thought I would elaborate on the answer and post it to the blog. It reminds us of how far we have come and what a difference a new architecture and a clean slate can make.
How did the technology you used contribute to Alfresco and why was it important?
The Enterprise Content Management industry has not innovated in the last several years as the major vendors try to integrate acquired technologies and repositories into their respective stacks. Alfresco uses technology innovation to deliver the full functionality of ECM, adapt faster to new standards and customer requirements, and be easier to use. With the benefit of 15 years' hindsight from the co-founder of Documentum, Alfresco has a clear vision of a full ECM suite with document, records, image and web content management, and has built the system using production-ready open source tools in the span of only two years.
Alfresco was built using the Spring open source application development framework and Aspect-Oriented Programming (AOP), a technique developed at Xerox PARC to address code reuse through code that “plugs in” to systems and objects. Using the Spring framework and AOP, Alfresco provides simple hook points to add new functions, features and rules into the repository when applications perform actions such as accessing a document, saving content, or moving or updating information. Using these modular AOP plug points, it is easy to add services like authentication, permissions, transformation, versioning, or retention control. This makes the system much easier to extend without having to rebuild it, and also future-proofs the architecture. It also makes for a much faster repository because components or metadata that are not necessary for an application can just be unplugged.
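To make the idea of a hook point concrete, here is a minimal sketch in plain Java. It does not use Spring AOP or any real Alfresco API: the names ContentService, BeforeHook and withHooks are hypothetical, and a JDK dynamic proxy stands in for the interception that an AOP framework would provide. The point is only that the service itself knows nothing about auditing or permissions, and that each concern can be plugged in or left out.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HookPointSketch {

    /** Hypothetical repository service; not Alfresco's real interface. */
    public interface ContentService {
        void save(String path, String content);
        String read(String path);
    }

    /** Plain implementation with no knowledge of auditing or permissions. */
    public static class InMemoryContentService implements ContentService {
        private final Map<String, String> store = new HashMap<>();
        public void save(String path, String content) { store.put(path, content); }
        public String read(String path) { return store.get(path); }
    }

    /** A pluggable hook that runs before any service method is invoked. */
    public interface BeforeHook {
        void before(String methodName, Object[] args);
    }

    /** Wraps the target so that every call passes through the hooks first. */
    public static ContentService withHooks(ContentService target, List<BeforeHook> hooks) {
        InvocationHandler handler = (proxy, method, args) -> {
            for (BeforeHook hook : hooks) {
                hook.before(method.getName(), args);
            }
            return method.invoke(target, args);
        };
        return (ContentService) Proxy.newProxyInstance(
                ContentService.class.getClassLoader(),
                new Class<?>[] { ContentService.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> auditLog = new ArrayList<>();

        // Two independent concerns, each added without touching the service.
        BeforeHook audit = (name, a) -> auditLog.add(name);
        BeforeHook permissions = (name, a) -> {
            if (name.equals("save") && ((String) a[0]).startsWith("/locked/")) {
                throw new SecurityException("read-only folder: " + a[0]);
            }
        };

        List<BeforeHook> hooks = new ArrayList<>();
        hooks.add(audit);
        hooks.add(permissions);

        ContentService service = withHooks(new InMemoryContentService(), hooks);
        service.save("/docs/plan.txt", "draft");
        System.out.println(service.read("/docs/plan.txt"));
        System.out.println(auditLog);
    }
}
```

Dropping a hook from the list is the "unplugging" described above: the call then goes straight through with no extra work.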
Alfresco has evolved very rapidly by plugging in dozens of other full-function, production-ready open source projects using the Spring Framework rather than reinventing the wheel. The Hibernate object-relational mapping system enables Alfresco to create a model-driven architecture with configuration rather than programming, and hides the complexities of the underlying database. The Lucene full-text search engine indexes content and metadata and potentially scales much further than pure database solutions. The jBPM full-featured business process engine provides simple, modular business processes. The Java-based Rhino JavaScript and FreeMarker templating engines provide lightweight programming and user interface extensions that are robust and scalable.
The Alfresco repository makes ECM easier and gets users off uncontrolled shared file drives by emulating shared drives and controlling content with rules. CIFS (Common Internet File System), the protocol used by Microsoft shared file drives, allows all Windows-based applications to access the repository directly, displays additional metadata and thumbnails in Windows Explorer, provides drag and drop from other windows, and allows users to synchronize their content offline. Below CIFS, user-definable rules and scripts process new content, extract metadata, classify content, move content, apply a workflow or retention policy, or render the content in web-ready formats. Users can then search the content using Google-like searches or searches constrained by metadata, and can aggregate multiple repositories using the OpenSearch protocol.
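As a rough illustration of rules that fire when new content arrives, the sketch below pairs a condition with an action and runs each matching rule in order. It is a simplified model written for this post, not Alfresco's rule engine: Item, Rule, onNewContent and the folder and metadata names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class FolderRuleSketch {

    /** Hypothetical content item: a name, a folder and free-form metadata. */
    public static class Item {
        public final String name;
        public String folder;
        public final Map<String, String> metadata = new HashMap<>();
        public Item(String name, String folder) {
            this.name = name;
            this.folder = folder;
        }
    }

    /** A rule pairs a condition with an action to run on matching items. */
    public static class Rule {
        final Predicate<Item> condition;
        final Consumer<Item> action;
        public Rule(Predicate<Item> condition, Consumer<Item> action) {
            this.condition = condition;
            this.action = action;
        }
    }

    /** Runs every rule, in order, against an item that has just arrived. */
    public static void onNewContent(Item item, List<Rule> rules) {
        for (Rule rule : rules) {
            if (rule.condition.test(item)) {
                rule.action.accept(item);
            }
        }
    }

    /** Example rules: classify PDFs, then file invoices and set retention. */
    public static List<Rule> exampleRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(
                i -> i.name.endsWith(".pdf"),
                i -> i.metadata.put("type", "document")));
        rules.add(new Rule(
                i -> i.name.startsWith("invoice"),
                i -> {
                    i.folder = "/finance/invoices";
                    i.metadata.put("retention", "7y");
                }));
        return rules;
    }

    public static void main(String[] args) {
        Item item = new Item("invoice-2007-03.pdf", "/inbox");
        onNewContent(item, exampleRules());
        System.out.println(item.folder + " " + item.metadata);
    }
}
```

A user who drops a file onto the emulated shared drive never sees any of this; the rules run underneath, which is what keeps the shared-drive experience while still bringing the content under control.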
By conforming to standards, Alfresco ensures that applications built against the system can be migrated and are future-proofed. As a 100% Java system portable to dozens of different systems, Alfresco is often one of the first ECM systems to implement standards such as CIFS, OpenSearch, Web Services and the JSR-170 Java Content Repository standard interface. Alfresco exposes its user interface as standard JSR-168 portlets and through the ubiquitous Tomcat application server. Alfresco has integrated the standard JavaScript language for server-side scripting and is extending into new Web 2.0-style protocols such as REST-style interfaces based upon JavaScript, OpenSearch, RSS, and ATOM. Alfresco is now adding a standard SQL interface and the OpenID single sign-on protocol.
The Alfresco system would not be as full-featured nor as adaptable to requirements without a strong technology base and the commitment to use open source components.





You mentioned authentication, how about the ease of adding in externalizable authorization via XACML?
Posted by: James | March 01, 2007 at 11:49 AM
I'll refer back to the post I made back in March of last year. http://newton.typepad.com/content/2006/03/ecm_answers_for.html
"Directory systems are orthogonal to XACML, even though they shouldn’t be. XACML has been designed to provide access control mechanisms to services that do not have their own security control, like new web services. The problem with ECM in using XACML is that their security models are implemented at a lower level and with semantics that may not be as broad as the capabilities of XACML. In addition, XACML does not understand the semantics of the ECM system such as hierarchical structures, roles and mappings for security that can require complex caching schemes to quickly validate content access. We have just been going through an exercise in AIIM iECM to map security services along with IBM and Documentum where we have tried to rationalize these two approaches."
Applying XACML to individual resources (documents or web content) seems an overly complex way of relating XACML to ECM. Defining a role-based interface might work. Also, defining a collection upon which XACML can be applied in relation to operations on items in that collection might also work. I'm happy to hear any proposals you might have.
In the meantime, the standards groups have completely given up on security in general as too hard to rationalize. JSR-283 has settled on a model of applying general policies on content from within the ECM without defining what a policy is. Not quite enough to implement XACML which works by applying the policy outside of the ECM. Will think about this one some more.
Posted by: John Newton | March 01, 2007 at 02:58 PM
You are correct in that using XACML to apply security to individual documents would be evil. That being said, security should be applied based on role which not only provides a cleaner ECM model but also allows XACML to be cleanly implemented.
One general observation is that the ECM community at large seems to be stuck in the past relative to other domains and has convinced itself that it is the only one dealing with millions of elements and the need for sophisticated caching. How can this mindset be broken?
Posted by: James | March 05, 2007 at 11:33 AM