BioLingua - An Integrated System for Biological Discovery

Andrew Pohorille (SSX)

We are developing an interactive, web-based programming environment called BioLingua. It allows biologists to analyze biological systems and to perform state-of-the-art biological computations. BioLingua enables scientists to combine their data, knowledge, computational tools and relevant previous discoveries, share them with other members of the community, and incrementally upgrade the shared knowledge as a distributed community scientific project. Both data and knowledge may take different forms, such as genomic, proteomic, metabolic and gene expression data, as well as representations of metabolic and regulatory pathways, but they are always accessible through a common framework. This information is further combined to conduct complex genomic and proteomic analyses, discover gene regulatory networks, predict how gene regulation is affected by diseases or by changes in the environment and, in turn, connect changes in gene regulation with corresponding changes in metabolism or, more generally, in phenotype.

The tools of BioLingua are very useful for astrobiology and space biology, especially for understanding the behavior of organisms in extreme environments, the acclimation and adaptation of living systems to conditions in space, and interactions between organisms in communities.

Although we already have a working prototype of BioLingua, considerable development is still required, especially for building causal models of biological processes. This raises several problems for computer science, such as:

1. Bayesian methods (most likely). How to combine highly heterogeneous prior knowledge and data in the most effective way?
2. Search. How to perform constrained search effectively in the space of models?
3. Representation. How to extract and represent constraints and data from different sources?
4. Visualization. How to represent the resulting models such that they can be comprehended by biologists?