Texai Project
Texai is a knowledge-based software project to create artificial intelligence.
Revised July 1, 2009
Introduction
Texai is a knowledge-based, open source project to create artificial intelligence. The project’s approach is to first construct a bootstrap English dialog system whose goals are to acquire linguistic and common sense skills to improve its own performance. Next, the system will acquire expertise in algorithms and in Java programming, for the purpose of explicitly representing its own behavior in the knowledge base (KB). Thus it will understand, revise, test and automatically compose its own source code. In parallel at this point, the system will acquire lexical and common sense knowledge from the glosses (word sense definitions) in the Texai lexicon, and begin to convert Wikipedia English text into KB statements, fleshing out the OpenCyc terms. In addition to scaling to many disparate users via Jabber chat from a single Texai instance, the system will be deployed as a virtual appliance to compute clusters and to a multitude of Internet users, where each instance hosts one or more nodes organized within an Albus Hierarchical Control System. These Albus nodes (i.e. agents) will be organized into agencies, many mirroring current human organizations, in which a node is a user’s proxy into the Albus hierarchy for some role. The artificial intelligence will then consist of a vast community of organizations whose members are Albus nodes, each quite intelligent with regard to its agency’s mission.
Initial Deployment Plan
The initial deployment of the English bootstrap dialog system was planned for June 23, 2009, which is Alan Turing’s birthday. But the release will be delayed for a few more days as described in this status post about the remaining usability issues. Texai will communicate, as an online chatbot, with volunteer mentors to acquire word sense meanings. Usage during the remainder of 2009 will confirm or deny the hypothesis that Texai will be able to figure out for itself a substantial portion of the WordNet word senses from their text definitions, after having learned the most frequently occurring definition word senses.
What is Available Now
The project’s knowledge base is stored in the Sesame RDF server. Because the initial knowledge base is large, it has been partitioned into separate Sesame repositories. These have been extracted into RDF and released in the file download section here. The project’s domain objects are persisted in Sesame using the RDF Entity Manager and semantic annotations. The RDF Entity Manager is released as a separate component. See the file download section here.
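To give a rough feel for what persisting a domain object as RDF means, here is a minimal sketch in Python. It is not the RDF Entity Manager API and does not talk to Sesame. The WordSense class, its fields, the to_triples helper and the example.org namespace are all assumptions invented for this illustration. It only shows the general idea of mapping each field of an object to one subject-predicate-object statement.

```python
from dataclasses import dataclass, fields

# Hypothetical namespace, used only for this illustration.
EXAMPLE_NS = "http://example.org/texai-sketch/"
# The standard RDF predicate that states the type of a resource.
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


@dataclass
class WordSense:
    """A toy domain object; the class and its fields are invented for this sketch."""
    identifier: str
    lemma: str
    gloss: str


def to_triples(obj):
    """Map a dataclass instance to (subject, predicate, object) triples, one per field."""
    subject = f"<{EXAMPLE_NS}{obj.identifier}>"
    triples = [(subject, RDF_TYPE, f"<{EXAMPLE_NS}{type(obj).__name__}>")]
    for field in fields(obj):
        if field.name == "identifier":
            continue  # the identifier is already used to build the subject URI
        predicate = f"<{EXAMPLE_NS}{field.name}>"
        value = str(getattr(obj, field.name))
        # Escape backslashes and double quotes so the literal stays well-formed.
        value = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append((subject, predicate, f'"{value}"'))
    return triples


sense = WordSense("book-n-1", "book", "a written work or composition")
for triple in to_triples(sense):
    print(" ".join(triple), ".")
```

Each printed line is one statement in an N-Triples-like form. In the real project the mapping is driven by semantic annotations on the Java domain classes rather than by a helper function like this one.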
News
Click on the Home link above for the latest blog posts.
3 Responses to “Texai Project”

Stefano Bertolo on 25 May 2009 at 6:55 am #
hi Steve,
can you give a small example of:
i) an RDF representation of a Java class + methods;
ii) an English sentence that would describe a modification to said class/methods
iii) the RDF translation of said sentence
iv) the mechanism that would integrate i) and iii) to yield a correct RDF representation of the new desired functionality for the class + methods?
thanks in advance,
stefano
Steve Reed on 25 May 2009 at 8:52 pm #
Hi Stefano,
I made your great question into a blog post: Java Programming Via Dialog.
-Steve
Steve Reed on 12 Jun 2009 at 7:48 am #
On the AGI-list, Matt Mahoney said:
I entirely agree with Matt’s comment above. The notion of bootstrapping in the Texai English dialog system is to learn the meanings of the most frequently occurring words in the definitions of its yet-to-be-learned vocabulary, and then by reading their definitions, learn the meanings of the remaining words with help from a multitude of volunteer mentors.
In particular Matt said:
An analysis of the word usage frequency in the Texai vocabulary definitions reveals that knowing perhaps only 10,000 frequently occurring words should be enough to understand half of the whole lexicon of 85,000 English words.
I acknowledge that there must be a very expensive process of encoding knowledge explicitly. Like Cycorp’s initial approach for DARPA’s Rapid Knowledge Formation project, for which I was the first project manager, Texai will use English dialog to rapidly acquire knowledge. I hypothesize that such dialog greatly reduces the expense of teaching new facts to the system, and also permits a vast multitude of volunteer mentors to divide the effort: “many hands make light work”.
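The bootstrapping idea and the vocabulary coverage estimate above can be illustrated with a small Python sketch. This is not Texai code. The function names, the toy glosses and the simplifying rule that a word counts as understood once every word in its definition is understood are assumptions made for illustration only. Real understanding of a gloss involves word sense disambiguation and mentor dialog, which this sketch ignores.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a gloss and split it into alphabetic words."""
    words = []
    for raw in text.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        if word:
            words.append(word)
    return words


def bootstrap_coverage(glosses, seed_size):
    """Return the set of words understood after seeding with frequent gloss words.

    glosses maps each vocabulary word to its definition text.
    The seed is the seed_size most frequent words found in the definitions.
    Assumption: a word is understood once all words in its definition are.
    """
    tokenized = {word: tokenize(gloss) for word, gloss in glosses.items()}
    frequency = Counter(w for tokens in tokenized.values() for w in tokens)
    known = {w for w, _ in frequency.most_common(seed_size)}
    changed = True
    while changed:  # repeat until a full pass learns nothing new
        changed = False
        for word, tokens in tokenized.items():
            if word not in known and all(t in known for t in tokens):
                known.add(word)
                changed = True
    return known


# Toy lexicon, invented for this example.
toy_glosses = {
    "animal": "a living thing",
    "plant": "a living thing",
    "dog": "a living animal",
    "puppy": "a dog",
}

for seed_size in (2, 3):
    learned = bootstrap_coverage(toy_glosses, seed_size).intersection(toy_glosses)
    print(seed_size, sorted(learned))
```

In this toy lexicon, a seed of the two most frequent definition words unlocks nothing, while a seed of three unlocks every entry, because each newly understood word makes further definitions readable. Run against a real lexicon, the same loop would give a crude estimate of how much of the vocabulary a given seed size could reach.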