
Genomes to Life Contractor-Grantee Workshop II
February 29-March 2, 2004, Washington, D.C.

Genomics:GTL Program Projects


Lawrence Berkeley National Laboratory

Rapid Deduction of Stress Response Pathways in Metal/Radionuclide Reducing Bacteria

2

VIMSS Computational Microbiology Core Research on Comparative and Functional Genomics

Adam Arkin1,2,3 (aparkin@lbl.gov), Eric Alm1, Inna Dubchak1, Mikhail Gelfand4, Katherine Huang1, Kevin Keck1, Frank Olken1, Vijaya Natarajan1, Morgan Price1, and Yue Wang2

1Lawrence Berkeley National Laboratory, Berkeley, CA; 2University of California, Berkeley, CA; 3Howard Hughes Medical Institute, Chevy Chase, MD; and 4Research Institute for the Genetics and Selection of Industrial Microorganisms, Moscow, Russia

The primary roles of the Computational Core are to curate, analyze, and ultimately build models of the data generated by the Functional Genomics and Applied Environmental Microbiology Core groups. The near-term focus of the computational group has been to build the scientific and technical infrastructure necessary to carry out these roles. In particular, the efforts of the computational group have been directed toward three objectives: genomics and comparative genomics, curation and analysis of experimental data from the other core groups, and modeling. Central to each of these goals has been the development of a comprehensive relational database that integrates genomic data and analyses together with data obtained from experiment.

VIMSS DB. At present, well over 100 microbial genomes have been sequenced, and hundreds more are currently in the pipeline. Despite this fact, tools to explore this wealth of information have focused on individual genome sequences. The VIMSS Comparative Genomics database and web-based tools are designed to facilitate cross-species comparison, as well as to integrate experimental data sets with genome-scale functional annotations such as operon and regulon predictions, metabolic maps, and gene annotations according to the Gene Ontology. Over 130 complete genome sequences are represented in the VIMSS Comparative Genomics Database, which is implemented as a MySQL relational database, a Perl library for accessing the database, and a user-friendly website designed for laboratory biologists (http://escalante.lbl.gov). This database is currently being augmented with a novel graph for the efficient query of biological pathways and supporting data. A generic Java-based tool for the graphical construction of queries on representations of relational database schema (particularly for pathways) is nearly finished and will be applied to VIMSS DB in the first quarter of 2004.

Web-Based Tools. The VIMSS Comparative Genome Browser allows users to align any number of genomes and identifies predicted orthology relationships between genes. Users can save genes of interest for use in the VIMSS Bioinformatics Workbench (VBW) and explore individual genes in depth for information about sequence domains, BLAST alignments, predicted operon structure, and functionally related genes inferred from a combination of comparative genomics methods and microarray experiments. The VertiGO comparative gene ontology browser allows users to simultaneously view the genetic complement of any number of genomes according to the Gene Ontology hierarchy. A metabolism browser based on the KEGG metabolic maps allows browsing either the set of enzymes predicted to be present in a single genome, or a comparison highlighting the metabolic differences between two genomes. VBW allows users to create and save lists of genes of interest, and to use these lists to investigate phylogenetic relationships by making multiple sequence alignments and phylogenetic trees, as well as to apply DNA motif-finding software to identify potential regulatory elements in upstream sequences. Novel motif-finding algorithms exploiting the comparative analysis of orthologous proteins have already accurately identified difficult motifs such as those from the MerR family of regulators of heavy-metal resistance.

Genome Annotation. One of the stated goals of the GTL program is to produce next-generation annotation of target genomes, including automated gene functional annotations and prediction of gene regulatory features, along with validation of these in silico methods. The most fundamental unit of gene regulation in bacteria is the operon, which is a set of genes that are cotranscribed on a single RNA transcript. Because few operons have been characterized experimentally outside the model organisms E. coli and B. subtilis, in silico operon prediction methods have been validated only in these two organisms. We have therefore made accurate and unbiased operon predictions in all bacteria a priority for the computational group. To avoid bias that might arise from using experimental data from only two organisms, we have opted to avoid the use of experimental data entirely when making predictions, using techniques from the field of unsupervised machine learning, and we used gene expression data to estimate the accuracy of our predictions. Key to the success of this approach has been integrating experimental data from the Functional Genomics Core group into our Comparative Genomics Database to validate our in silico procedures. Using our operon prediction tool, we have established that, contrary to reports in the literature, the bacterium Helicobacter pylori has a large number of operons. In addition, by examining unusually large non-coding regions within highly conserved operons, we have identified putative pseudogenes in Bacillus anthracis that allow us to make phenotypic predictions about the motility of the sequenced Ames strain. As a critical test of our automated genome annotations, we are hosting a genome annotation jamboree in April at the Joint Genome Institute, in which our automated predictions will be verified by human curators. We expect that our annotations, along with confidence levels, will reduce the manual curation workload, allowing participants to focus most of their efforts on scientific hypothesis testing.
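
As an illustration of what an unsupervised approach of this general kind can look like, the sketch below splits adjacent same-strand gene pairs into "likely same operon" and "likely different operon" groups using only intergenic distance and a two-cluster k-means. This is a toy under stated assumptions, not the Computational Core's method: the abstract does not name the features or the algorithm used, and the function names and distances are hypothetical.

```python
def two_means_1d(values, max_iter=100):
    """Cluster 1-D values into two groups; return the (low, high) centroids."""
    # Start the centroids at the extremes so the result is deterministic.
    low, high = float(min(values)), float(max(values))
    for _ in range(max_iter):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = sum(near_low) / len(near_low)
        # If every value sits with the low centroid, keep the old high centroid.
        new_high = sum(near_high) / len(near_high) if near_high else high
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return low, high


def label_same_operon(intergenic_bp):
    """True where a gene pair falls in the short-distance cluster."""
    low, high = two_means_1d(intergenic_bp)
    return [abs(d - low) <= abs(d - high) for d in intergenic_bp]


# Hypothetical intergenic distances (bp) between adjacent same-strand genes;
# a negative value means the two genes overlap.
distances = [5, 12, -4, 250, 310, 8, 180]
labels = label_same_operon(distances)
```

No experimentally characterized operons are needed to fit the two clusters, which is the property the paragraph above emphasizes; independent data such as expression profiles would then be used only to check the labels.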

Functional Genomics. The Functional Genomics Core group is beginning to produce large data sets detailing the response of our target organisms to a variety of stress conditions. The Computational Core group is charged with storing and redistributing these data; assisting in the statistical analysis and processing of raw data; and facilitating comparison of experiments performed with different experimental techniques, different conditions, or different target organisms. As a test case, we have focused most of our efforts in this direction on gene expression microarray experiments. Among the challenges in the representation of microarray data is developing a data schema that includes both raw and processed data, metadata describing the experimental conditions, and a technical description mapping, for example, each array spot to a corresponding region of the genome sequence and to the set of annotated genes (and their orthologs in other species). We are actively following the development of standards for the representation of this type of data (see Data Management below), and in the meantime have implemented our own simple formats aimed at quick integration with our Comparative Genomics Database. To interpret the results of these experiments, it was necessary to develop a standard set of procedures for data normalization and significance testing and apply it uniformly to raw data from each experiment set, because processed data from different labs commonly involve slightly different analytical techniques. By establishing common methodologies and a common repository for different experimental results, we were able to meet the goal of facilitating comparative studies as well as using the functional genomic data to test hypotheses generated from our comparative genomic analysis. The methods have been applied to the analysis of pH, salt, and heat stress data from Shewanella oneidensis. Results from this analysis will be described.
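
To make the idea of applying one normalization uniformly to raw data concrete, the following sketch median-centers the log2 ratios of a single two-channel array. The abstract does not say which normalization or significance procedures were adopted; median centering is simply one common choice, and the function name and intensities below are hypothetical.

```python
import math
from statistics import median


def median_centered_log_ratios(treatment, control):
    """Return log2(treatment/control) per spot, shifted so the array median is 0."""
    if len(treatment) != len(control):
        raise ValueError("treatment and control must have one value per spot")
    log_ratios = [math.log2(t / c) for t, c in zip(treatment, control)]
    center = median(log_ratios)
    return [r - center for r in log_ratios]


# Hypothetical raw spot intensities for three spots on one array.
normalized = median_centered_log_ratios([200.0, 400.0, 800.0],
                                        [100.0, 100.0, 100.0])
```

Running the same function over the raw data from every laboratory, rather than accepting each laboratory's processed values, is what keeps the resulting ratios comparable across experiments.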

Data Management. During the first year of the project, laboratories in the project began putting in place experimental procedures and are now beginning to produce substantial amounts of data. There is a critical need to define what descriptions of data and experimental procedures (protocols) and factors need to be developed and captured, and to put in place procedures for documenting and recording that information. Recognizing this need, we are in the process of reviewing how experimental procedures are being documented and how experimental factors are being recorded by LBNL-affiliated laboratories. This information will be used not only to facilitate information and data acquisition procedures, but also to enhance and upgrade the BioFiles system for data uploading and the underlying database management system. Working with a consortium of researchers from the wider GTL community, we have produced a report on the current status of national data standards, with their advantages and deficiencies, and a plan for developing standardization of metadata and data representation.

3

Managing the GTL Project at Lawrence Berkeley National Laboratory

Nancy A. Slater (naslater@lbl.gov)

Lawrence Berkeley National Laboratory, Berkeley, CA

The effective management of the GTL systems biology project at Lawrence Berkeley National Laboratory (LBNL) is essential to the success of the project. The comprehensive management plan for the project includes milestone planning and project integration, a plan for communicating and collaborating with the project stakeholders, financial management and website updates. In addition, the management plan incorporates reviews by committees, including a monthly Executive Committee review comprised of LBNL leadership, an annual Scientific Advisory Committee review, a biannual Technical Advisory Panel review to ensure that the project’s technical development is aligned with related DOE efforts, and a monthly Steering Committee conference call where the project leaders discuss the project’s progress and status.

A key responsibility in the project management process is troubleshooting problems related to the scientific and financial management of the project. There is a delicate balance between having adequate resources to achieve the scientific objectives of the project and working within the funding levels of the project. If an area is falling behind on its scientific milestones, the project manager must work closely with the researchers to resolve problems as efficiently and effectively as possible.

Milestone Planning and Project Integration

A detailed list of project deliverables and milestones is updated by the PIs at the beginning of each fiscal year. The process of updating and reviewing milestones ensures that the goals of each PI are aligned with the overall goals of the project. These milestones are the basis for an integrated project schedule, which is managed using Microsoft® Project. The project schedule is updated monthly, and progress is reported through progress reports and teleconferences with the PIs. The updated project schedule is posted to the project website, so that all of the collaborators have access to the most recent status of the project.

The project is divided into three separate Core groups, and the integration plan for the project assures that the Core groups work together toward the objectives of the project. The Core Research group leaders are responsible for ensuring smooth operation of their section of the project as well as cooperation with the other groups. For example, the Applied Environmental Microbiology leader is responsible for ensuring that cell culture protocols are acceptable to the Functional Genomics Core, which will ultimately use the cell cultures for experiments. The Functional Genomics Core leader is responsible for ensuring quality control for data production and timely data uploads into the database. The Computational Core leader is responsible for ensuring that data entry, querying, and curation interfaces serve the needs of the other groups, and that the models are usable by biologists outside of the modeling group. The success of each group depends on a well-integrated project team.

Communication and Collaboration

The GTL project at LBNL is a collaborative effort between seven institutions, thirteen researchers and their associated laboratories. The project’s communications plan consists of a variety of media, including a project website, monthly group meetings, conference calls, an annual retreat, workshops at conferences, and monthly progress reports.

The monthly group meetings include a presentation from one of the Core Research groups and are attended by the local northern California GTL project team members. Several conference calls are held on a regular basis, including a monthly Steering Committee meeting in which all of the researchers participate, a monthly BioFiles conference call in which a representative from each laboratory discusses data generation, uploads, and handling, and a quarterly conference call with DOE. The LBNL project has an annual retreat in which the researchers present data and findings related to their area of focus, and other laboratory team members (Computer Science Engineers, Microbiologists, Database Managers, Graduate Students, Post Docs, etc.) present posters in a poster forum. The annual retreat has proven to be very successful in building working relationships among the dispersed group. The LBNL GTL project will be participating in several workshops at international conferences in 2004. The monthly progress reports are composed of input from each researcher and include updates on the status of the milestones, planned work, and any problems or issues encountered.

Website Updates

The GTL project at LBNL is the inaugural project for the Virtual Institute of Microbial Stress and Survival (VIMSS), and details regarding the project are located on the world wide web at http://vimss.lbl.gov. This website serves as a tool for communicating the status of the project as well as:

Financial Management

Each of the researchers provides input into the annual spend plan for the project. The finances of the project are tracked on a continuous basis, and the researchers receive monthly reports showing actual costs versus the spend plan. The finances of the project are maintained using software packages at LBNL as well as spreadsheets and charts. These tools allow the Project Manager to identify spending trends, so that appropriate action can be taken to keep the project aligned with the annual spend plan. The Executive Committee reviews the project financial reports monthly.

4

VIMSS Applied Environmental Microbiology Core Research on Stress Response Pathways in Metal-Reducers

Terry C. Hazen*1 (TCHazen@lbl.gov), Hoi-Ying Holman1, Sharon E. Borglin1, Dominque Joyner1, Rick Huang1, Jenny Lin1, David Stahl2, Sergey M. Stolyar2, Matthew Fields3, Dorothea Thompson3, Jizhong Zhou3, Judy Wall4, H.-C. Yen4, and Martin Keller5

*Presenting author

1Lawrence Berkeley National Laboratory, Berkeley, CA; 2University of Washington, Seattle, WA; 3Oak Ridge National Laboratory, Oak Ridge, TN; 4University of Missouri, Columbia, MO; and 5Diversa Corporation, San Diego, CA

Field Studies

Sulfate-reducing bacteria. Sediment samples from different depths at the background and Areas 1, 2, and 3 sites of the NABIR Field Research Center have been used for the enrichment of sulfate-reducing microorganisms. Sulfate-reducing enrichments have been positive for sediments in Areas 1 and 2 when lactate or acetate was used as the electron donor, and some of the enrichments differ in their capacity to reduce cobalt, chromium, and uranium. Groundwater enrichments from Areas 1, 2, and 3 all displayed sulfate reduction with different electron donors (lactate, butyrate, acetate, pyruvate), and these enrichments could also reduce iron, cobalt, and chromium. Subsurface sediments from the wells FWB-107 (13.2 m) and FWB-109 (15.4 m) in Area 3 were serially diluted in a basal salts medium that contained lactate and ethanol with different electron acceptors. The results suggested that in the sampled sediments (13 to 15 m) nitrate-reducers were approximately 3500 to 5400 cells/g, iron-reducers 50 to 1700 cells/g, and sulfate-reducers 240 to 1100 cells/g. The predominant population (25%) of the 10⁻² sulfate-reducing dilution had 88% sequence identity with Desulfosporosinus blif. Subpopulations that had 95% to 97% sequence identity with Desulfosporosinus orientis accounted for an additional 37% of the library. Other clones had 98% sequence identity with Clostridium chromoreductans.

Clone libraries. Since stress response pathways are clustered on chromosomal DNA fragments and generally vary in length from 20 to 40 kb, it is essential to clone large DNA fragments to capture entire pathways. We have developed effective DNA extraction methods and vector/host systems that allow stable propagation of large DNA fragments in E. coli. Processed environmental samples are embedded in agarose noodles for protein digestion and release of high molecular weight DNA. In stressed environments, organism concentrations are often very low, so we have developed a method for increasing the concentration of large DNA by amplification with a phage polymerase. After amplification, the DNA is partially digested with restriction enzymes and size-selected by agarose gel electrophoresis. It is then ligated to fosmid arms and packaged into phage lambda particles that are used to infect E. coli. The microbial diversity of the libraries is determined with Terminal Restriction Fragment Length Polymorphism (T-RFLP). Large fragment DNA has been extracted and amplified from 15 NABIR FRC samples (comprising 3 areas at various depths). Small insert DNA libraries have been constructed from most of these samples, and large insert DNA libraries are in various stages of construction. T-RFLP and DNA sequencing are being used for quality control of the resulting libraries.

Enrichments. Seven Desulfovibrio strains were isolated from lactate-sulfate enrichment of sediment taken from the most contaminated region of Lake DePue, IL. Their 16S rRNA and dsrAB genes were amplified and sequenced. All were identical to each other and virtually identical to the corresponding genes from D. vulgaris Hildenborough. One mismatch was observed in the 16S rRNA gene and one in dsrAB. Different fragment patterns confirmed that the DePue isolates were similar but not identical to D. vulgaris Hildenborough. Pulsed-field electrophoretic analysis of I-CeuI digests revealed that both isolates had five rRNA clusters, the same as D. vulgaris Hildenborough. However, the length of one chromosomal segment in the DePue isolates was considerably shorter than the corresponding fragment from D. vulgaris Hildenborough, suggesting the presence of a large deletion in the genomes of the isolates (or an insertion in D. vulgaris Hildenborough).

Culture and Biomass Production

Defined Media – Growth. A defined medium for optimal growth and maximum reproducibility of Desulfovibrio vulgaris was developed for biomass production for stress response studies. The medium was optimized by evaluating a variety of chemical components (including the removal of yeast extract, excess sulfate, and Fe) and redox conditions to optimize cell density and generation times, and to reduce lag times. Growth was monitored using direct cell counts, optical density, and protein concentration. The generation time for D. vulgaris in the original Baar’s medium was 3 h, reaching a maximum density of 10⁸ cells/ml and an OD600 of 0.4. The generation time for D. vulgaris on the defined medium LS4D was 5 h, with a maximum cell density of 10⁹ cells/ml and an OD600 of 0.9-1.0. LS4D is well suited for the monitoring protocols, as well as the equipment and large scale processing needed for biomass production.
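
Generation times such as those quoted above are conventionally obtained from two measurements taken during exponential growth, using g = t · ln 2 / ln(N_t / N_0). The sketch below applies that standard relation; the function name and the example readings are hypothetical and are not data from this study.

```python
import math


def generation_time(n0, nt, elapsed_h):
    """Doubling time in hours, assuming exponential growth between two readings."""
    if n0 <= 0 or nt <= n0 or elapsed_h <= 0:
        raise ValueError("need 0 < n0 < nt and a positive elapsed time")
    return elapsed_h * math.log(2) / math.log(nt / n0)


# Hypothetical direct counts: 1e7 to 8e7 cells/ml over 15 h is three doublings.
example_h = generation_time(1e7, 8e7, 15.0)
```

The same function works with optical density or protein concentration in place of cell counts, provided both readings fall within the exponential phase.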

Dual culture systems. Co-cultures of two different Desulfovibrio species (Desulfovibrio vulgaris Hildenborough and Desulfovibrio sp. PT2), each syntrophically coupled to a hydrogenotrophic methanogen (Methanococcus maripaludis) on a lactate medium without sulfate, have been established and characterized. No appreciable growth was observed in 50 mM lactate for single-organism cultures. Following optimization of the ionic composition (MgCl2 and NaCl) of the medium, stable co-cultures were established having generation times of 25 h⁻¹ and 35 h⁻¹ for D. vulgaris and Desulfovibrio sp. PT2 co-cultures, respectively. Both co-cultures degraded lactate to acetate, methane, and carbon dioxide. No other organic acids were detected during the course of the experiments. Approximately 1 mol of acetate and 1 mol of methane were produced from 2 mol of lactate by both co-cultures during the most active period of growth. The stability of the established methanogen-SRB co-cultures (Desulfovibrio vulgaris or Desulfovibrio sp. PT2 with M. maripaludis) was confirmed by serial transfer (six times).

Biofilm reactors. Initial characterization of Desulfovibrio vulgaris growth as a biofilm was carried out using a 600 ml biofilm reactor containing 3 mm glass beads as growth substratum and the B3 culture medium (16 mM lactate and 28 mM sulfate). The ratio of flow rates through an internal recirculation loop to influent was maintained at 100:1, evaluating two different influent flow rates (0.5 ml/min or 30 ml/hr). Formation of a loose biofilm was associated with significant gas accumulation within the reactor. The system is now being modified to incorporate a gas trap in the recirculation loop.

FairMenTec (FMT) chemostat. A pilot run with Desulfovibrio vulgaris Hildenborough in the FMT bioreactor in chemostat mode was completed. The bioreactor was operated using the LS4D medium with 45 mM lactate, 50 mM sulfate, and Ti-citrate at 1/3 standard formulation (subsequent batch cultures have shown improved growth with further reduction of the Ti-citrate to 1/6 standard formulation). Varying flow rates and medium compositions were evaluated.

Oxygen Stress Experiments

Protocols. Since episodic exposure to air or oxygenated ground water is common at contaminated sites, we decided to focus on oxygen stress of D. vulgaris for our initial studies. To accommodate all the investigations that would require simultaneous harvesting of biomass (proteomics, transcriptomics, metabolomics, and phenotypic studies), a batch culture system was developed for 2000 ml cultures that can be sparged with nitrogen or air in water baths to control stress, with rigorous quality control on culture age, sampling, defined media, chain of custody, and harvesting times and techniques.

Phenotypic responses. Desulfovibrio vulgaris enters a new phenotypic state when confronted with a sudden influx of oxygen. Using SEM and TEM, we observed that during the first 24-72 h of exposure to air D. vulgaris cells are negatively aerotactic; they gradually lose their flagella and begin to elongate, and by 20 days of exposure they are 3-4 times larger and have a well-developed exopolysaccharide sheath. At all times the cells were viable and recovered when put back under anaerobic conditions. Real-time analysis using Synchrotron Fourier Transform Infrared (S-FTIR) Spectromicroscopy enabled us to determine quantitative changes in peptides and saccharides in the living cells during exposure to air, thus providing the exact timing of cell changes in the stress response. During the early phase of the exposure, we observed decreases in total cellular proteins as well as changes in the secondary structures of proteins that are indicative of changes in the local hydrogen-bonding environments and the presence of granular protein. During the late phase of the exposure, we observed the production of polysaccharides, concomitant with the production of the external sheath. The S-FTIR also demonstrated that the cells were viable within the sheath at 20 days of exposure. Phospholipid fatty acid (PLFA) analysis confirmed that no biomass was lost during air sparging of stationary phase cells. In addition, no change in the PLFA patterns was observed during air sparging, indicating that neither cell growth nor death occurred. The PLFA extraction is being developed as a method for routine monitoring of cultures during biomass production and stress studies. Databases of lipid signatures of D. vulgaris during various growth conditions are being developed to augment the information produced by other VIMSS collaborators on proteomics and functional genomics.

5

VIMSS Functional Genomics Core: Analysis of Stress Response Pathways in Metal-Reducing Bacteria

Jay Keasling*1 (keasling@socrates.berkeley.edu), Steven Brown4, Swapnil Chhabra2, Brett Emo3, Weimin Gao4, Sara Gaucher2, Masood Hadi2, Qiang He4, Zhili He4, Ting Li4, Yongqing Liu4, Vincent Martin1, Aindrila Mukhopadhyay1, Alyssa Redding1, Joseph Ringbauer Jr.3, Dawn Stanek4, Jun Sun5, Lianhong Sun1, Jing Wei5, Liyou Wu4, Huei-Che Yen3, Wen Yu5, Grant Zane3, Matthew Fields4, Martin Keller5 (mkeller@diversa.com), Anup Singh2 (aksingh@sandia.gov), Dorothea Thompson4, Judy Wall3 (wallj@missouri.edu), and Jizhong Zhou4 (zhouj@ornl.gov)

*Presenting author

1Lawrence Berkeley National Laboratory, Berkeley, CA; 2Sandia National Laboratories, Livermore, CA; 3University of Missouri, Columbia, MO; 4Oak Ridge National Laboratory, Oak Ridge, TN; and 5Diversa Corporation, San Diego, CA

Introduction: Environmental contamination by metals and radionuclides constitutes a serious problem in many ecosystems. Bioremediation schemes involving dissimilatory metal ion-reducing bacteria are attractive for their cost-effectiveness and limited physical detriment and disturbance to the environment. Desulfovibrio vulgaris, Shewanella oneidensis, and Geobacter metallireducens represent three different groups of organisms capable of metal and radionuclide reduction whose complete genome sequences were determined with the support of DOE-funded projects. Utilizing the available genome sequence information, we have focused our efforts on the experimental analysis of various stress response pathways in D. vulgaris Hildenborough using a repertoire of functional genomic tools and mutational analysis.

Transcript analysis: D. vulgaris is a δ-proteobacterium with a genome size of approximately 3.6 Mb. Whole-genome microarrays of D. vulgaris were constructed using 70-mer oligonucleotides. All ORFs in the genome are represented with 3,471 (97.1%) unique probes and 103 (2.9%) non-specific probes that may cross-hybridize with other ORFs. The microarrays were employed to investigate the global gene expression profiles of D. vulgaris in response to elevated salt and nitrite concentrations as well as exposure to oxygen. Approximately 370 ORFs were up-regulated (≥3-fold) and 140 ORFs were down-regulated when D. vulgaris cells were treated with 0.5 M NaCl for 0.5 hour. For example, genes involved in glycine, betaine, or proline transport were up-regulated 5-, 19- and 26-fold, respectively. Almost half of the genes with significant changes in expression are predicted to encode conserved hypothetical or hypothetical proteins. After 4-hour treatment, approximately 140 ORFs were up-regulated and more than 700 ORFs were down-regulated. Patterns of gene expression were distinctly different between time points. With 1 mM nitrite, D. vulgaris exhibited a lag phase of 28 h compared to a 5 h lag phase in controls without nitrite addition. Strong nitrite treatment (5 or 10 mM) triggered a transient growth arrest, and growth resumed gradually after 5 hours, suggesting the ability of D. vulgaris to overcome the toxicity of nitrite. Transcriptional profiling analysis was carried out following nitrite (10 mM) treatment. Transcripts highly up-regulated throughout the 5 h following nitrite shock included genes encoding two iron-sulfur cluster-binding proteins (65- and 15-fold) and a hybrid cluster (Fe/S) protein (24-fold). All three ORFs are annotated as redox-active proteins, and the hybrid cluster protein has been specifically proposed to participate in nitrogen metabolism. Surprisingly, the nitrite reductase genes, like the formate dehydrogenase genes, were only moderately up-regulated (3-fold).
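
The ≥3-fold criterion used above can be written as a simple filter over treatment/control expression ratios. The sketch below is illustrative only: the abstract states the cutoff for up-regulation, and applying the reciprocal cutoff (≤1/3) for down-regulation is an assumption, as are the function name, ORF labels, and ratios.

```python
def classify_by_fold_change(ratios, threshold=3.0):
    """Split ORFs into up- and down-regulated lists from treatment/control ratios."""
    up, down = [], []
    for orf, ratio in ratios.items():
        if ratio <= 0:
            raise ValueError(f"non-positive expression ratio for {orf}")
        if ratio >= threshold:
            up.append(orf)
        elif ratio <= 1.0 / threshold:  # assumed symmetric cutoff
            down.append(orf)
    return sorted(up), sorted(down)


# Hypothetical ORF labels and expression ratios.
ratios = {"ORF_A": 19.0, "ORF_B": 0.2, "ORF_C": 1.4, "ORF_D": 3.0}
up, down = classify_by_fold_change(ratios)
```

A fold-change filter of this kind says nothing about statistical significance, so in practice it would be combined with a replicate-based test before an ORF is reported as differentially expressed.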

Protein analysis: A combination of Differential In-Gel Electrophoresis (DIGE), Isotope-Coded Affinity Tags (ICAT), and comprehensive proteome analyses was used to investigate the response of the D. vulgaris proteome to heat shock and O2 stress. DIGE analysis of the heat-shock stress response identified a total of 650 proteins. Sixty-three (63) proteins showed differences between the heat-shocked (30 min) and control conditions. Using the complementary ICAT analysis, we were able to identify a total of 219 proteins from the D. vulgaris proteome. Out of this pool of proteins, 7 stress-related proteins were identified. Similar analysis was also done with O2-stressed cells. Based on cysteine-containing tryptic peptides, a total of 92 proteins were identified. Among the identified proteins, 40 showed differences between the O2-stressed and control conditions, and of these at least 6 are known to be involved in the O2-stress response. Total comprehensive proteome analysis of D. vulgaris was also used to investigate differential protein expression induced by O2 stress. Cellular tryptic-digested proteins from control and stressed cultures were analyzed by 3D µLC-MS-MS. A total of 1,791 unique proteins were identified.

Protein complex analysis: Based on the preliminary DIGE analysis of heat shock response in D. vulgaris, HSP70 (ORF00281) was identified as being involved in this stress condition. Western analysis using antibodies to the E. coli homolog (63% sequence identity) showed enhanced production of ORF00281 (HSP70). The anti-HSP70 antibody was then used to study bait-prey interactions in whole cell protein extracts from the heat shock condition using the Co-Immunoprecipitation kit for immobilization. Approximately 7 “pulled down” protein bands were observed as possibly interacting proteins with HSP70. These bands were gel extracted and further analyzed by LC-MS-MS. To generate tagged proteins for identifying protein complexes in D. vulgaris, we have also explored the application of the IBA Strep-tag vector system for generating single chromosomal copies of genes fused to the tag sequence. We have generated a fusion of dnaK with the tag and have integrated it into the chromosome of D. vulgaris in single copy to determine the effectiveness of this system for providing complexes for proteomics analysis.

Metabolite analysis: We have developed a hydrophilic interaction chromatography method coupled to MS/MS detection to separate and identify nucleotides and redox cofactors. In addition, CE-MS methods were developed to analyze a variety of metabolites, including amino acids, nucleic acid bases, nucleosides, nucleotides, organic acid CoAs, redox cofactors, and the metabolic intermediates of glycolysis, the TCA cycle and the pentose phosphate pathway. All the methods were validated using E. coli cell extracts. Approximately 100 metabolites can be separated and identified. The development of an efficient method to obtain D. vulgaris metabolite extracts and its application to analyze stress responses in D. vulgaris are in progress.

Development of a genetic system: In efforts to improve the genetic versatility of D. vulgaris, spontaneous mutants resistant to either nalidixic acid or rifampicin were selected. These antibiotic resistances will allow counter-selection of sensitive E. coli donors in conjugation experiments. Additional effort has been made to screen antibiotic sensitivity and resistance of D. vulgaris. The wild type was sensitive to G418 (400 µg/ml), ampicillin (20-50 µg/ml), and carbenicillin (20-50 µg/ml), and resistant to gentamicin. The drug resistance markers present on many routinely used cloning vectors confer resistance to these antibiotics. Marker exchange mutagenesis of a number of regulatory genes is in progress by a procedure that will introduce molecular barcodes into the deletion sites. Sucrose sensitivity will be used to enrich for the second recombination event necessary to delete the wild-type copies of the target genes. Interestingly, we found that sucrose sensitivity is not expressed well in all Desulfovibrio strains. To further streamline methods for gene knockout, a vector system that uses a single cross-over event for gene deletion has been created. A 750-bp internal gene sequence flanked by 20-bp UP and DOWN barcodes will be used to simultaneously knock out and barcode each gene. Conjugal transfer using E. coli will be used to transform D. vulgaris with the suicide knockout vectors. The single cross-over gene deletion system also attempts to address issues of polar mutations. Additionally, a lacZ reporter will be incorporated into the site of gene deletion. Methylumbelliferyl β-D-galactoside, a fluorogenic substrate for the β-galactosidase reporter, will be used for colony screening under anaerobic conditions. Finally, experiments to generate a library of transposon mutants are also underway. Putative mutants have been generated and will be screened for the presence and copy number of the transposon, stability of the antibiotic resistance, and randomness of the insertion.
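
The layout of the single cross-over insert described above (a 750-bp internal gene fragment flanked by 20-bp UP and DOWN barcodes) can be sketched in a few lines. Everything beyond those two lengths is an assumption made for illustration: the function names, the choice to take the fragment from the middle of the gene, and the use of random barcodes are hypothetical, and the sequence is randomly generated rather than taken from D. vulgaris.

```python
import random

BASES = "ACGT"


def random_barcode(rng, length=20):
    """Return a random DNA barcode of the requested length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def build_knockout_insert(gene_seq, rng, fragment_len=750, barcode_len=20):
    """Return (insert, up, down): an internal gene fragment flanked by barcodes."""
    if len(gene_seq) <= fragment_len:
        raise ValueError("gene is too short to contain an internal fragment")
    # Take the fragment from the middle of the gene so it lacks both gene ends.
    start = (len(gene_seq) - fragment_len) // 2
    fragment = gene_seq[start:start + fragment_len]
    up = random_barcode(rng, barcode_len)
    down = random_barcode(rng, barcode_len)
    return up + fragment + down, up, down


rng = random.Random(0)  # fixed seed so the sketch is reproducible
gene = "".join(rng.choice(BASES) for _ in range(1200))  # stand-in gene sequence
insert, up, down = build_knockout_insert(gene, rng)
```

A real design would additionally require each barcode to be unique across the mutant collection and absent from the host genome, checks that are omitted here.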