03/01/2013
Identification of disease variants via homozygosity mapping and investigation of the effects of genomewide homozygosity regions on traits of biomedical importance have been widely applied recently. Nonetheless, the existing methods and algorithms to identify long tracts of homozygosity (TOH) are not able to provide efficient and rigorous regions for further downstream association investigation. We expanded current methods to identify TOHs by defining ‘‘surrogateTOH’’, a region covering a cluster of TOHs with specific characteristics. Our defined surrogateTOH includes cTOH, viz a common TOH region where at least ten TOHs present; gTOH, whereby a group of highly overlapping TOHs share proximal boundaries; and aTOH, which are allelicallymatched TOHs. Searching for gTOH and aTOH was based on a repeated binary spectral clustering algorithm, where a hierarchy of clusters is created and represented by a TOH cluster tree. Based on the proposed method of identifying different species of surrogateTOH, our cgaTOH software was developed. The software provides an intuitive and interactive visualization tool for better investigation of the highthroughput output with special interactive navigation rings, which will find its applicability in both conventional association studies and more sophisticated downstream analyses. NCBI genome map viewer is incorporated into the system. Moreover, we discuss the choice of implementing appropriate empirical ranges of critical parameters by applying to disease models. This method identifies various patterned clusters of SNPs demonstrating extended homozygosity, thus one can observe different aspects of the multifaceted characteristics of TOHs.
 Author:

 Format:


01/01/2011
Let $G=(V,E)$ be a graph and T be a spanning tree of G. We consider the following strategy in advancing in G from a vertex x towards a target vertex y: from a current vertex z (initially, $z=x$), unless $z=y$, go to a neighbor of z in G that is closest to y in T (breaking ties arbitrarily). In this strategy, each vertex has full knowledge of its neighborhood in G and can use the distances in T to navigate in G. Thus, additionally to standard local information (the neighborhood $N_G(v)$), the only global information that is available to each vertex v is the topology of the spanning tree T (in fact, v can know only a very small piece of information about T and still be able to infer from it the necessary treedistances). For each source vertex x and target vertex y, this way, a path, called a greedy routing path, is produced. Denote by $g_{G,T}(x,y)$ the length of a longest greedy routing path that can be produced for x and y using this strategy and T. We say that a spanning tree T of a graph G is an additive rcarcass for G if $g_{G,T}(x,y)\leq d_G(x,y)+r$ for each ordered pair $x,y\in V$. In this paper, we investigate the problem, given a graph family $\mathcal{F}$, of whether a small integer r exists such that any graph $G\in\mathcal{F}$ admits an additive rcarcass. We show that rectilinear $p\times q$ grids, hypercubes, distancehereditary graphs, dually chordal graphs (and, therefore, strongly chordal graphs and interval graphs) all admit additive 0carcasses. Furthermore, every chordal graph G admits an additive $(\omega+1)$carcass (where $\omega$ is the size of a maximum clique of G), each 3sunfree chordal graph admits an additive 2carcass, and each chordal bipartite graph admits an additive 4carcass. In particular, any ktree admits an additive $(k+2)$carcass. All those carcasses are easy to construct in sequential as well as in distributed settings.
 Author:

 Format:


01/01/2010
Background Chronic lymphocytic leukemia (CLL) is the most common adult leukemia. It is a highly heterogeneous disease, and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobin heavy chain variable region (IgVH) mutational status was found to be associated with patient survival outcome, and biomarkers linked to the IgVH status has been a focus in the CLL prognosis research field. However, biomarkers highly correlated with IgVHmutational status which can accurately predict the survival outcome are yet to be discovered. Results In this paper, we investigate the use of gene coexpression network analysis to identify potential biomarkers for CLL. Specifically we focused on the coexpression network involving ZAP70, a well characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO) and used the frequent network mining algorithm CODENSE to identify highly connected gene coexpression networks spanning the entire genome, then evaluated the genes in the coexpression network in which ZAP70 is involved. We then applied a set of feature selection methods to further select genes which are capable of predicting IgVH mutation status from the ZAP70 coexpression network. Conclusions We have identified a set of genes that are potential CLL prognostic biomarkers IL2RB, CD8A, CD247, LAG3 and KLRK1, which can predict CLL patient IgVH mutational status with high accuracies. Their prognostic capabilities were crossvalidated by applying these biomarker candidates to classify patients into different outcome groups using a CLL microarray datasets with clinical information.
 Author:

 Format:


01/01/2009
A class of singularly perturbed quasilinear diﬀerential equations with discontinuous data is examined. In general, interior layers will appear in the solutions of problems from this class. A numerical method is constructed for this problem class, which involves an appropriate piecewiseuniform mesh. The method is shown to be a parameteruniform numerical method with respect to the singular perturbation parameter. Numerical results are presented, which support the theoretical results.
 Author:

 Format:


01/01/2006
In this paper we introduce a new notion of collective tree spanners. We say that a graph G=(V,E)admits a system of $\mu$ collective additive tree rspanners if there is a system T(G) of at most $\mu$ spanning trees of G such that for any two vertices x,y of G a spanning tree T\in \cT(G) exists such that d_T(x,y)\leq d_G(x,y)+r. Among other results, we show that any chordal graph, chordal bipartite graph or cocomparability graph admits a system of at most log2n collective additive tree 2spanners. These results are complemented by lower bounds, which say that any system of collective additive tree 1spanners must have $\Omega(\sqrt{n})$ spanning trees for some chordal graphs and $\Omega(n)$ spanning trees for some chordal bipartite graphs and some cocomparability graphs. Furthermore, we show that any cchordal graph admits a system of at most log2n collective additive tree (2\lfloor c/2\rfloor)spanners, any circulararc graph admits a system of two collective additive tree 2spanners. Towards establishing these results, we present a general property for graphs, called (\al,r)$decomposition, and show that any $(\al,r)$decomposable graph G with n vertices admits a system of at most $\log_{1/\al} n$ collective additive tree $2r$spanners. We discuss also an application of the collective tree spanners to the problem of designing compact and efficient routing schemes in graphs. For any graph on n vertices admitting a system of at most $\mu$ collective additive tree rspanners, there is a routing scheme of deviation r with addresses and routing tables of size $O(\mu \log^2n/\log \log n)$ bits per vertex. This leads, for example, to a routing scheme of deviation $(2\lfloor c/2\rfloor)$ with addresses and routing tables of size $O(\log^3n/\log \log n)$ bits per vertex on the class of cchordal graphs.
 Author:

 Format:


06/28/2005
The revolutionary growth in the computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes including a cleaner draft of human genome have been sequenced raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy to administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of hostbacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and antimicrobial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wetlab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene expression analysis to derive regulatory pathways, the development of statistical techniques, clustering techniques and data mining techniques to derive proteinprotein and proteinDNA interactions, and modeling of 3D structure of proteins and 3D docking between proteins and biochemicals for rational drug design, difference analysis between pathogenic and nonpathogenic strains to identify candidate genes for vaccines and antimicrobial agents, and the whole genome comparison to understand the microbial evolution. The development of bioinformatics techniques has enhanced the pace of biological discovery by automated analysis of large number of microbial genomes. We are on the verge of using all this knowledge to understand cellular mechanisms at the systemic level. The developed bioinformatics techniques have potential to facilitate (i) the discovery of causes of diseases, (ii) vaccine and rational drug design, and (iii) improved cost effective agents for bioremediation by pruning out the dead ends. Despite the fast paced global effort, the current analysis is limited by the lack of available genefunctionality from the wetlab data, the lack of computer algorithms to explore vast amount of data with unknown functionality, limited availability of proteinprotein and proteinDNA interactions, and the lack of knowledge of temporal and transient behavior of genes and pathways.
 Author:

 Format:


01/01/2005
In this paper singularly perturbed semilinear differential equations with a discontinuous source term are examined. A numerical method is constructed for these problems which involves an appropriate piecewiseuniform mesh. The method is shown to be uniformly convergent with respect to the singular perturbation parameter. Numerical results are presented that validate the theoretical results.
 Author:

 Format:


01/01/2000
We prove that clawfree graphs, containing an induced dominating path, have a Hamiltonian path, and that 2connected clawfree graphs, containing an induced doubly dominating cycle or a pair of vertices such that there exist two internally disjoint induced dominating paths connecting them, have a Hamiltonian cycle. As a consequence, we obtain linear time algorithms for both problems if the input is restricted to (claw,net)free graphs. These graphs enjoy those interesting structural properties.
 Author:

 Format:


04/01/1998
In this paper fitted finite difference methods on a uniform mesh with internodal spacing h, are considered for a singularly perturbed semilinear twopoint boundary value problem. It is proved that a scheme of this type with a frozen fitting factor cannot converge epsilonuniformly in the maximum norm to the solution of the differential equation as the mesh spacing h goes to zero. Numerical experiments are presented which show that the same result is true, for a number of schemes with variable fitting factors.
 Author:

 Format:


06/01/1996
Boundary value problems for singularly perturbed semilinear elliptic equations are considered. Special piecewiseuniform meshes are constructed which yield accurate numerical solutions irrespective of the value of the small parameter. Numerical methods composed of standard monotone finite difference operators and these piecewiseuniform meshes are shown theoretically to be uniformly (with respect to the singular perturbation parameter) convergent. Numerical results are also presented, which indicate that in practice the method is firstorder accurate.
 Author:

 Format:

