SCIENCE & ENGINEERING NEWS
Kirkland, WASH. -- Rosetta Inpharmatics, Inc. announced the availability of a new file format, called Gene Expression Markup Language (GEML), that places data from DNA chips and other gene expression technologies into a single consistent format for more efficient and extensive data analysis. Rosetta Inpharmatics developed GEML to meet a growing market need: researchers require a consistent gene expression data format for analyzing data from the increasing number of available gene expression technologies. Detailed information on the benefits of GEML and how to use the GEML format is available at http://www.rii.com/geml.
“The anticipated value of gene expression data is widely acknowledged in the life science market. However, broad acceptance and use of this technology has been complicated and delayed by the need for a consistent format with which to integrate data from multiple technologies, all of which generate different data formats,” stated Mark Boguski, M.D., Ph.D., Senior Vice President of Research and Development for Rosetta Inpharmatics. “Rosetta’s goal in designing the GEML format was to minimize these complexities. The GEML format is available now and solves this issue by enabling researchers to compare data derived from a variety of gene expression platforms. This data format also supports Rosetta Inpharmatics’ goal of helping companies expedite drug discovery and development efforts.”
An important potential use of the GEML format is in reading molecular signatures of cancer and other diseases across various technology platforms. The National Cancer Institute’s Director’s Challenge Program, which is focused on developing molecular classifications of cancers, will explore the utility of the GEML format to expedite attainment of this goal. “We are very pleased that Rosetta is making GEML accessible to the NCI Director’s Challenge program. We look forward to working as partners with Rosetta toward seamless integration of complex data sets in order to find the most informative signatures of cancer for prevention, early detection, diagnosis and intervention research,” said Richard Klausner, M.D., NCI Director.
GEML is a data file format based on Extensible Markup Language (XML) that operates independently of any specific database schema. GEML supports data from various gene expression platforms, including Affymetrix, Agilent, Incyte, and Molecular Dynamics, as well as 1-color or 2-color fluorescent labeling and scanning, nylon filters, and other data formats. GEML also tracks which data format was used; with this knowledge, data can be normalized, integrated, and compared across technologies.
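To illustrate why recording the source platform alongside the measurements matters, here is a minimal sketch in Python. The XML layout, the element and attribute names (expression_data, experiment, platform, measurement, gene, intensity), and the sample values are all hypothetical and are not taken from the GEML specification. Median scaling is likewise only a generic normalization stand-in, not Rosetta's method.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical document: element names, attribute names, and values are
# illustrative only and are NOT taken from the GEML specification.
SAMPLE = """
<expression_data>
  <experiment id="exp1" platform="two_color_fluorescent">
    <measurement gene="GENE_A" intensity="200.0"/>
    <measurement gene="GENE_B" intensity="400.0"/>
    <measurement gene="GENE_C" intensity="800.0"/>
  </experiment>
  <experiment id="exp2" platform="nylon_filter">
    <measurement gene="GENE_A" intensity="10.0"/>
    <measurement gene="GENE_B" intensity="20.0"/>
    <measurement gene="GENE_C" intensity="40.0"/>
  </experiment>
</expression_data>
"""


def load_experiments(xml_text):
    """Parse the XML and return {experiment id: {"platform", "values"}}."""
    root = ET.fromstring(xml_text)
    experiments = {}
    for exp in root.findall("experiment"):
        values = {
            m.get("gene"): float(m.get("intensity"))
            for m in exp.findall("measurement")
        }
        experiments[exp.get("id")] = {
            "platform": exp.get("platform"),
            "values": values,
        }
    return experiments


def median_normalize(values):
    """Divide each intensity by the experiment's median intensity."""
    median = statistics.median(values.values())
    return {gene: value / median for gene, value in values.items()}


experiments = load_experiments(SAMPLE)
normalized = {
    exp_id: median_normalize(exp["values"])
    for exp_id, exp in experiments.items()
}

for exp_id, exp in experiments.items():
    print(exp_id, exp["platform"], normalized[exp_id])
```

In this toy data the two platforms report intensities on very different scales, yet after scaling the per-gene values line up, which is the kind of cross-technology comparison a shared, self-describing format is meant to make possible.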
As part of its efforts to encourage adoption of GEML as a potential industry standard for gene expression data, Rosetta intends to offer users of GEML additional tools for viewing and analyzing genomic data in GEML format. This will enable genomic researchers to publish and analyze data using any tools that support the GEML format. Rosetta also intends to pursue future enhancements with input from collaborators in the GEML community and to work with those collaborators to propose the format as a standard to the Object Management Group (OMG) in November.
Rosetta Inpharmatics offers solutions for the emerging field of informational genomics. By combining the power of informatics and genomics, Rosetta has created a proprietary platform that accelerates and enhances the drug discovery process for pharmaceutical and biotechnology companies, and that improves agricultural products. Rosetta Inpharmatics’ approach converts the rapidly growing amount of gene expression data, or information about a gene’s activity, into organized, statistically driven, information-based results. Information about Rosetta Inpharmatics can be found on the Web at http://www.rii.com.