Smart India Hackathon 2024

AI-Driven Deep-Sea Biodiversity Assessment

An environmental DNA (eDNA) analysis pipeline that uses machine learning to discover and classify marine life in Earth's least explored ecosystems

Project Overview

Addressing the poor representation of deep-sea organisms in traditional reference databases

The Problem

Traditional bioinformatic pipelines rely heavily on aligning sequences against databases built from terrestrial and shallow-water species. Because reference databases like SILVA, PR2, and NCBI lack comprehensive representation of deep-sea organisms, this leads to significant misclassifications and an underestimation of deep-sea biodiversity.

Our Solution

An AI-driven pipeline using deep learning and unsupervised learning algorithms to identify eukaryotic taxa directly from raw eDNA reads. Our system minimizes reliance on reference databases while enabling discovery of novel taxa and reducing computational time through optimized workflows.

Project Showcase

eDNA Sampling Process

AI Pipeline Architecture

Species Classification Results

How Our System Works

A comprehensive AI pipeline that transforms raw eDNA data into actionable biodiversity insights

Data Collection

Environmental DNA samples are collected from deep-sea environments and processed to extract 18S rRNA and COI marker genes for analysis.

AI Processing

Unsupervised clustering algorithms (K-means, DBSCAN, and hierarchical clustering) group sequences by similarity without relying on traditional reference databases.
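
As a rough illustration of database-free clustering, the sketch below turns each read into a k-mer frequency vector and groups the vectors with a small DBSCAN written in plain Python. The k-mer featurization, the toy reads, and the `eps` and `min_samples` values are assumptions made for this example and are not taken from the project; a real pipeline would more likely use a library implementation such as `sklearn.cluster.DBSCAN`.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing other characters (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: one label per point, -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Includes the point itself, since its distance to itself is 0.
        return [j for j, q in enumerate(points)
                if euclidean(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # core point: keep growing the cluster
    return labels


# Hypothetical toy reads: two similar pairs and one unrelated read.
reads = [
    "ACACACACACAC",
    "ACACACACACAA",
    "GTGTGTGTGTGT",
    "GTGTGTGTGTGG",
    "TTTTTTTTTTTT",
]
profiles = [kmer_profile(r, k=2) for r in reads]
labels = dbscan(profiles, eps=0.3, min_samples=2)
print(labels)  # [0, 0, 1, 1, -1]
```

The last read ends up with the label -1 because no other read lies within `eps` of it. DBSCAN does not need the number of clusters in advance, which suits data where the number of taxa is unknown; K-means and hierarchical clustering could be run on the same profiles.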

Classification

Unsupervised learning identifies and classifies taxa across major groups: Annelida, Arthropoda, Chordata, Cnidaria, Echinodermata, Mollusca, and Porifera.

Discovery

The system estimates abundance, identifies novel taxa, and provides comprehensive biodiversity assessments for conservation and research.
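
One simple way to derive such a summary from cluster labels is sketched below. The function name `summarize_clusters`, the example labels, and the choice of the Shannon index as a diversity measure are assumptions for illustration and do not describe the project's actual method. Read counts are only a proxy for abundance, and unassigned reads are candidates for novel taxa at best, since they may also be sequencing errors or chimeras.

```python
from collections import Counter
import math


def summarize_clusters(labels, noise_label=-1):
    """Relative abundance per cluster, Shannon index, and unassigned count."""
    counts = Counter(labels)
    n_unassigned = counts.pop(noise_label, 0)
    total = sum(counts.values())
    if total == 0:
        return {}, 0.0, n_unassigned
    # Share of clustered reads that fall into each cluster.
    abundance = {c: n / total for c, n in sorted(counts.items())}
    # Shannon index H = -sum(p * ln p) over the cluster shares.
    shannon = -sum(p * math.log(p) for p in abundance.values())
    return abundance, shannon, n_unassigned


# Hypothetical labels for ten reads; -1 marks reads left unclustered.
labels = [0, 0, 0, 0, 1, 1, 2, 2, -1, -1]
abundance, shannon, n_unassigned = summarize_clusters(labels)
print(abundance)      # {0: 0.5, 1: 0.25, 2: 0.25}
print(n_unassigned)   # 2
```

Here the two unclustered reads are reported separately so that they can be inspected as possible novel taxa without distorting the abundance shares.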

Technical Capabilities

ML Algorithms

K-means, DBSCAN, Hierarchical

Taxonomic Groups

Coverage of major marine phyla

Marker Genes

18S rRNA & COI