GBF Series' articles
Recent Submissions
-
Marktdaten zur Biotechnologie - Produkte und Reaktoren [Market Data on Biotechnology: Products and Reactors]
Owing to the high growth rates widely forecast for the biotechnology market, and encouraged by the Federal Government's biotechnology programme, in which the development of bioprocess engineering ranks highly, numerous plant and equipment manufacturers are now considering becoming active and investing in biotechnology in order to participate in this market in the long term. The Verband Deutscher Maschinen- und Anlagenbau e.V. (VDMA) is likewise aware of the growing importance of biotechnology and organized workshops on the subject in 1986 and 1987. However, expectations and forecasts for the future biotechnology market differ enormously. Even determining the current market volume is difficult. The discrepancies arise in part from unclear definitions of which products are counted as biotechnology. In estimating the potential of the new biotechnology products, their market entry is particularly uncertain. Many forecasts are speculative and cannot be verified. "Bio-forecasting" currently constitutes a market of its own with a not inconsiderable volume. According to a report by N. Rau in BTF-Biotech-Forum 3 (1986) 121, more than 300 market studies, analyses and research reports on biotechnology are available, with a combined price of about DM 2 million. Individual studies cost up to US$ 35,000, and prices of up to DM 100 per page are not unusual. The sheer number of forecasts alone illustrates the confusion.
With the publication of this issue, GBF seeks to make the discussion more objective. Part I compiles, naturally without any claim to completeness, data from market studies and analyses, on the basis of which some, in our view conservative, conclusions about the expected bio-boom can be drawn. Part II of the study considers only the bioreactor market, since many plant and equipment manufacturers, especially those serving the pharmaceutical, food and luxury-food industries, see it as a favourable entry point into a lucrative market.
-
Statistical Analysis of DNA Sequences
We report recent results of statistical analysis of DNA sequences in this paper. First, we show how repetitive segments reduce the information content measured by Shannon entropies. Then, the effect of a nonuniform codon usage on the mutual information function is studied analytically. For this purpose, the concept of pseudo-exons is introduced. Finally, we discuss the modularity of DNA which can serve as a sensitive tool to distinguish DNA sequences from random strings.
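The entropy effect mentioned in this abstract can be illustrated with a small sketch. It is not the authors' analysis: it simply computes the Shannon entropy of overlapping k-letter words and compares a tandem repeat with a pseudo-random string of the same length. The function name `block_entropy`, the toy sequences and the choice k = 3 are assumptions made for illustration.

```python
import random
from collections import Counter
from math import log2


def block_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy in bits of the overlapping length-k words in seq."""
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(words)
    counts = Counter(words)
    return -sum(n / total * log2(n / total) for n in counts.values())


# Hypothetical toy sequences, not data from the paper.
rng = random.Random(0)
random_seq = "".join(rng.choice("ACGT") for _ in range(300))
repeat_seq = "ACGT" * 75  # tandem repeat of the same length

for name, seq in [("random", random_seq), ("repeat", repeat_seq)]:
    print(name, round(block_entropy(seq, k=3), 3))
```

The tandem repeat contains only four distinct 3-letter words, so its word entropy stays near 2 bits, while the pseudo-random string approaches the 6-bit maximum for words of length 3.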
-
An Integrated Services Approach to Biological Sequence Databases
Database users in molecular biology are faced with steadily increasing amounts of raw data, multiple database providers and services. Here we describe the integration of a set of previously isolated database services and demonstrate their accessibility through a uniform user interface. A multi-layered software architecture is applied to make different degrees of service integration transparent to the user. We focus on the design of specialized gateways that integrate services differing in temporal behavior and stateless or state dependent operation. Gateways may reside on heterogeneous platforms. A link layer is introduced to integrate individual query functions in order to interrelate simple, complex and state dependent services through a common, unique interface. It is possible to generate new complex services by a combination of multiple functions. We describe the application of the World Wide Web (WWW) as the implementation framework of the interface layer. To assure interoperability of services, integrity of data resources must be supervised. Consistency control is issued by a dedicated synchronization layer.
-
Algebraic Methods for the Analysis of Redundancy and Identifiability in Metabolic 13C-Labelling Systems
Stationary 13C tracer experiments supply a large amount of information related to metabolic fluxes in microorganisms. The unknown intracellular fluxes can be determined from some directly measured metabolic fluxes and the fractional labelling of intracellular carbon atom pools. Metabolic 13C-Labelling Systems are modelled by large algebraic equation systems with respect to fluxes and fractional labels. Identifiability of the unknown intracellular fluxes and redundancy of measured quantities are of great importance for the design and evaluation of experiments. This contribution presents algebraic methods to treat these problems a priori and a posteriori. The Gröbner basis algorithm from polynomial ideal theory is shown to be capable of solving all relevant problems. Ideas from algebraic geometry prove to be helpful in designing corresponding computer algebraic solution strategies. As an application example some global results on the identifiability of bidirectional reaction steps are derived.
-
Algorithmic Representation of Large RNA Folding Landscapes
In evolutionary processes on the molecular level, neutral alterations play an important role [Kim83]. The primary sequence of a DNA or RNA molecule is changed by errors introduced by the replication machinery at certain rates. Such a mutation does not necessarily affect the phenotype. The molecule’s contribution to the individual’s overall fitness is mediated by its function. For non-encoding RNA, the function in turn is determined by the 3D-structure of the molecule. A mutation in the primary sequence of such an RNA molecule that does not alter its structure is called neutral.
-
A Consensus Match Scoring System that is Correlated with Biological Functionality
The Ci-scoring system is suitable for the identification of putative functional transcription factor binding sites solely by sequence analysis. This is a very important feature since it allows preselection of candidate binding sites for experimental analysis from uncharacterized genomic sequences. Data derived from the analysis of almost 2.4 million nucleotides of genomic sequences show a good correlation of Ci-scoring with known biological function. High-scoring binding sites are clearly over-represented in putative control regions of the genomic sequences. Known functional binding sites cluster in the same regions. Furthermore, we demonstrate high Ci-scores to correlate with biological functionality of 26 individual binding sites for three completely unrelated transcription factors. Consensus matches known to be either nonfunctional or to bind their corresponding protein factors with highly reduced affinity are low-scorers in our rating.
-
Das Gensequenzanalysesystem DIANA [The Gene Sequence Analysis System DIANA]
DIANA (Dna Interactive Artificial Neural-network Analysis) is a software package for the analysis of gene sequences. It allows a precise determination of splicing positions and coding regions in the human genome. The method is based on cascading neural networks, which were specially trained for the identification of human genes. DIANA has a graphical user interface that is easy to understand. The analysis of 100,000 base pairs takes only a few seconds on a standard workstation. DIANA can be extended to further organisms.
-
Simulation and Animation of Intracellular Diffusion
A computer program was created to calculate and animate the intracellular diffusion of molecules that pass from the external space of a tissue into the internal space, where they are distributed. The temporal and spatial distribution of the molecule concentration could be displayed on a PC screen. The program was used for the interpretation of lidocaine effects on the sodium channels of the membrane of spinal sensory ganglion cells.
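The abstract does not say which numerical scheme the program used. As a hedged illustration of the kind of calculation involved, the sketch below solves the one-dimensional diffusion equation dc/dt = D d2c/dx2 with an explicit finite-difference scheme. The function name `diffuse_1d`, the grid size, the boundary conditions and all parameter values are assumptions and are not taken from the paper.

```python
def diffuse_1d(n_cells=20, d_coeff=1.0, dx=1.0, dt=0.2, steps=500, c_ext=1.0):
    """Explicit finite-difference solution of dc/dt = D * d2c/dx2 on a 1D grid.

    Cell 0 is held at the external concentration c_ext (assumed constant);
    the last cell has a no-flux boundary. All values are arbitrary units.
    """
    r = d_coeff * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx^2 > 0.5")
    c = [0.0] * n_cells
    c[0] = c_ext
    for _ in range(steps):
        new = c[:]  # cell 0 keeps the external concentration
        for i in range(1, n_cells - 1):
            new[i] = c[i] + r * (c[i - 1] - 2 * c[i] + c[i + 1])
        new[-1] = c[-1] + r * (c[-2] - c[-1])  # no-flux inner boundary
        c = new
    return c


profile = diffuse_1d()
print([round(x, 3) for x in profile])
```

Storing the profile after each time step instead of only the final one would give the frames needed for an animation of the concentration over time and space.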
-
Statistical Significance of Local Alignments with Gaps
Recent results on the statistical significance of local alignment with gaps are presented. Parameters necessary for computation of the probability that an alignment achieves a certain score can be approximated by a computationally fast simulation. We present applications to database searching where the re-sorting of the output by statistical significance instead of score leads to improved ability to distinguish sequences homologous to the probe sequence from unrelated sequences.
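To make the re-sorting idea concrete, here is a minimal sketch. It assumes that scores follow the extreme-value form known from ungapped local alignment, P(S >= x) ≈ 1 - exp(-K·m·n·e^(-λx)), where m and n are the lengths of the two sequences. The abstract does not state this formula. The values of λ and K, the query length and the hit list are hypothetical placeholders, not the parameters estimated in the paper.

```python
from math import exp, expm1


def expected_hits(score, m, n, lam, k):
    """Expected number of chance alignments scoring at least `score`."""
    return k * m * n * exp(-lam * score)


def p_value(score, m, n, lam, k):
    """P(S >= score) under the assumed extreme-value approximation."""
    return -expm1(-expected_hits(score, m, n, lam, k))


LAM, K = 0.25, 0.05   # placeholder parameters; in practice estimated by simulation
QUERY_LEN = 300       # hypothetical probe sequence length

# (name, alignment score, database sequence length): hypothetical hits
hits = [("hit_A", 60, 5000), ("hit_B", 55, 200)]

by_score = sorted(hits, key=lambda h: -h[1])
by_significance = sorted(
    hits, key=lambda h: p_value(h[1], QUERY_LEN, h[2], LAM, K)
)
print([h[0] for h in by_score])
print([h[0] for h in by_significance])
```

In this toy case the lower-scoring hit ranks first by significance, because a given score is more likely to arise by chance against a long database sequence than against a short one.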
-
Title, Preface, Contents, List of authors
The term bioinformatics has two quite distinct meanings. It may describe information handling in living organisms, and it is widely used for the application of computer science to biological problems. It is this second area which is covered in this book. The series of articles presented here represents a selection of the papers given at an invigorating conference on Bioinformatics/Computer Application in the Biosciences, held in October 1995 in Braunschweig at the German National Laboratory for Biotechnology. The development and use of computer applications in the biological sciences, though initiated rather late compared to the situation in physics and chemistry, has reached a high standard nowadays and has become an indispensable part of any research in this area. A strong impetus has come from modern gene sequencing projects and also from the rapid development in the field of structural biochemistry, i.e. the determination of protein and DNA/RNA 3D-structures as well as rational protein engineering and design. This is reflected in the subjects covered in the articles in this book. They describe the present state of this field. In particular, the following facts become obvious:
- The use and development of biological databases has become an essential foundation for research in protein science and molecular biology.
- Whereas the coding regions of DNA have been the main target of research in the past, nowadays the non-coding regions and RNA are receiving closer attention.
- The sequence comparison and correct alignment of protein sequences is a prerequisite for any protein engineering. Although routinely used in almost all biochemistry laboratories, alignment of sequences with low homology still requires further intensive research so that significantly better results can be produced than those currently available.
- The description and simulation of the interactions between different biological molecules will be one of the fascinating areas of future research.
- In addition to understanding the biological processes on a molecular level, we have to simulate the metabolism in the living cell in order to achieve real metabolic design for the optimal biotechnological production of compounds.
Whereas the first development of these methods stems from the sixties and seventies, it is only recently that biologists, chemists and computer scientists have channelled their expertise into large scale collaborative projects aimed at the advancement of this exciting area. Government programs started, for example in Germany and the UK, have provided extra money for joint projects involving computer scientists and biologists. Together with the rapid progress in modern biology and biotechnology, we can expect to see wide-ranging new developments in bioinformatics in the years to come.
-
Classification of Local Protein Structural Motifs by Kohonen Networks
Kohonen networks were used for automatic classification of local structural elements derived from a set of 136 non-homologous proteins. A reduced representation of protein backbones based on dihedral phi and psi angles was employed for construction of training patterns. Segments of nine residues were transformed into a 16-dimensional description by their angular values. Kohonen-mapping yielded a two-dimensional topological map of representative structural motifs. Helical structures and sheets are located on opposite sides of the feature map, and several intermediate forms are found. The map was used to trace protein backbones of two proteins, cytochrome b5 and γ-IV crystallin, leading to characteristic trajectories in the feature map.
-
Data set heterogeneities and their effects on the derivation of contact potentials
A new class of protein structure prediction methods is based on threading a sequence through a known 3D structure. These techniques offer a means of recognizing similarity in cases of distant evolutionary relationship, where the fold has been conserved to a greater extent than its sequence. The scoring tables used to evaluate different sequence-to-structure alignments are derived from a particular database of known structures. The distribution and the reliability of the threading scores are affected by parameters of this database, such as length and composition of the included chains, and the number of alternative alignments. We investigated several compositional aspects of two databases which were used in protein fold recognition studies. Among them are the number of chains from different folding classes, the amino acid distribution within the classes, and the sequence lengths.
-
3D-Segmentierungstechniken und vektorwertige Bewertungsfunktionen für symbolisches Protein-Protein-Docking [3D Segmentation Techniques and Vector-Valued Scoring Functions for Symbolic Protein-Protein Docking]
The growing number of known 3D protein structures calls for computing systems that predict whether and where two molecules interact with each other. This requires a search for possible docking sites of proteins. Based on the results of preprocessing techniques such as computation of molecular surfaces and segmentation, a knowledge-based control algorithm implemented with the semantic network ERNEST searches for geometrical and chemical complementarity on molecular surfaces and computes coarse docking positions, considering steric clash and simple geometric judgement functions. Additionally, ERNEST guides a more detailed analysis with finer calculations, including correlation of geometry and hydrophobicity. The proposed hierarchical system predicts possible docking sites for two given proteins completely automatically and in reasonably short computing times. A set of 18 representative examples is discussed.
-
An Algorithm for the Protein Docking Problem
We have implemented a parallel distributed geometric docking algorithm that uses a new measure for the size of the contact area of two molecules. The measure is a potential function that counts the “van der Waals contacts” between the atoms of the two molecules (the algorithm does not compute the Lennard-Jones potential). An integer constant c_a is added to the potential for each pair of atoms whose distance is in a certain interval. For each pair whose distance is smaller than the lower bound of the interval an integer constant c_s is subtracted from the potential (c_a < c_s). The number of allowed overlapping atom pairs is handled by a third parameter N. Conformations where more than N atom pairs overlap are ignored. In our “real world” experiments we have used a small parameter N that allows small local penetration. The algorithm almost always found good (rms) approximations of the real conformations that were among the best five geometric dockings. In 42 of 52 test examples the best conformation with respect to the potential function was an approximation of the real conformation. The running time of our sequential algorithm is in the order of the running time of the algorithm of Norel et al. [NLW+]. The parallel version of the algorithm has a reasonable speedup and modest communication requirements.
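The potential function described in this abstract can be sketched in a few lines. This is a brute-force illustration of the scoring rule only, not the authors' parallel implementation. The function name `contact_score`, the parameter names (`reward` and `penalty` for the two integer constants, `max_overlaps` for N) and the example coordinates are assumptions.

```python
from itertools import product
from math import dist


def contact_score(atoms_a, atoms_b, lower, upper, reward, penalty, max_overlaps):
    """Score one conformation by counting contacts between two atom sets.

    Pairs with lower <= distance <= upper add `reward`; pairs closer than
    `lower` count as overlaps and subtract `penalty`. A conformation with
    more than `max_overlaps` overlapping pairs is rejected (returns None).
    """
    score = 0
    overlaps = 0
    for p, q in product(atoms_a, atoms_b):
        d = dist(p, q)
        if d < lower:
            overlaps += 1
            if overlaps > max_overlaps:
                return None  # conformation is ignored
            score -= penalty
        elif d <= upper:
            score += reward
    return score


# Hypothetical coordinates and thresholds (arbitrary units).
molecule_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
molecule_b = [(4.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(contact_score(molecule_a, molecule_b, lower=3.0, upper=5.0,
                    reward=1, penalty=3, max_overlaps=1))
```

Checking every atom pair costs time proportional to the product of the two atom counts; a practical implementation would use a spatial grid or similar structure to visit only nearby pairs.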
-
Ähnlichkeitsanalyse biologisch aktiver Moleküle mit durch Autokorrelationsvektoren trainierten selbstorganisierenden Karten [Similarity Analysis of Biologically Active Molecules Using Self-Organizing Maps Trained with Autocorrelation Vectors]
Topological autocorrelation vectors can be used to estimate similarities of molecular structures. In the following paper we examine different data sets of increasing size and complexity with this measure of similarity. All data sets contain substances with known biological activity on the dopaminergic and benzodiazepine receptors. These two different classes of biologically active substances can be separated by self-organizing maps, a kind of neural network well suited for clustering and visualization of similarity. The method is implemented on a massively parallel SIMD computer (MasPar MP-1) which is able to perform this analysis for databases of several thousand substances.
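The abstract does not define the autocorrelation vector. A common definition sums, for each topological distance d, the products of an atomic property over all atom pairs that are d bonds apart. The sketch below follows that definition as an assumption. The function names, the adjacency-list representation and the property values are hypothetical.

```python
from collections import deque


def topological_distances(adjacency, start):
    """Shortest path lengths (in bonds) from `start` to all reachable atoms."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def autocorrelation_vector(adjacency, prop, max_dist):
    """vec[d] = sum of prop[i] * prop[j] over atom pairs i <= j at distance d."""
    vec = [0.0] * (max_dist + 1)
    for i in adjacency:
        for j, d in topological_distances(adjacency, i).items():
            if i <= j and d <= max_dist:  # count each unordered pair once
                vec[d] += prop[i] * prop[j]
    return vec


# Hypothetical three-atom chain 0-1-2 with a unit property on every atom.
chain = {0: [1], 1: [0, 2], 2: [1]}
unit_property = {0: 1.0, 1: 1.0, 2: 1.0}
print(autocorrelation_vector(chain, unit_property, max_dist=2))
```

Because the vector has a fixed length regardless of molecule size, molecules with different numbers of atoms can be presented to the same self-organizing map.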
-
Force Field Minimization: Domain Decomposition, Positive Definite Functions, and Wavelets
In force field calculations the 3D-structure of macromolecules is computed by minimization of the total internal energy. The large number of degrees of freedom causes numerical problems in the optimization procedure even for relatively small molecules. The number of free variables is reduced by a domain decomposition method assembling certain groups of atoms into configurational structures with considerably fewer degrees of freedom. To reduce the amount of computation necessary for a prescribed accuracy, approximations to the energy function with respect to these variables are constructed using methods from the theory of splines, wavelets, and positive definite functions.
-
PIEZOELECTRIC (PZ) IMMUNOSENSORS AND THEIR APPLICATIONS
The recent development of piezoelectric immunosensors is reviewed. The selectivity provided by the biological coatings together with the inherent sensitivity of the PZ devices and the ability to oscillate the crystal in liquid medium have induced a rising interest in this class of sensors. Methods of coating and several applications are reported including microgravimetric immunoassays, microbial assays and gas phase immunosensors.
-
BIOCHEMICAL CHARACTERIZATION OF A HIGHLY SPECIFIC TRIMETHYLAMINE DEHYDROGENASE SUITED FOR THE APPLICATION IN BIOSENSORS
A strain of Paracoccus possessing a NAD(P)+-independent trimethylamine dehydrogenase was isolated by enrichment cultivation. The enzyme was purified to 4.2 U/mg. Phenazine ethosulfate acts as an electron acceptor; other artificial mediators such as methylene blue or K3[Fe(CN)6] are inactive. Important biochemical data are: pH optimum: 9.0; substrates: specific for trimethylamine; dimethylamine, methylamine or trimethylamine-N-oxide are not converted; KM (trimethylamine): 6.6*¢10°° M. The calibration curve for trimethylamine (aqueous solution) using a kinetic photometric assay with dichlorophenol-indophenol is linear between 0.1 and 0.8 µM trimethylamine.
-
GRAPHITE BASED BIENZYME SENSORS
Robust peroxidase/oxidase sensors based on both chemically modified and unmodified graphite electrodes were prepared. The sensors acted at electrode potentials of -0.2 V to 0.2 V vs Ag/AgCl electrode. The sensors were specific to glucose, alcohols, choline chloride, L- and D-amino acids and oxalic acid. The sensitivities of the sensors ranged from 0.3 to 12 µA/mM and the determination ranges of metabolites were 0.01 - 20 mM. The concentration range of cholesterol determined using a hydrogen peroxide sensor and soluble cholesterol oxidase was 0.01 - 0.26 mM with a sensitivity of 0.25 mA/mM when determination was carried out using the biocatalytic pre-concentration regime. The pH optima and the long-term stabilities of the sensors depended largely on the properties of the oxidase immobilized.
-
BIOSENSOR SYSTEMS UTILIZING NADH AS AN INTERMEDIATE
The combination of dehydrogenases and chemically modified graphite electrodes has proved particularly useful in the development of highly selective and sensitive biosensor systems. Among the applications given is a flow injection system for the determination of fructose with a linear range of 1 pM - 2 mM. As a means to increase sensitivity further, substrate recycling was employed in the determination of ATP down to the nano-molar level. A creatine and creatinine assay illustrates the potential of coupled equilibria and coimmobilized enzymes. For the determination of glucose in whole blood a dialyzer was integrated with a biosensor. The dependence of glucose transfer through the dialysis membrane on the hematocrit was studied.