Smart India Hackathon 2024

AI-Driven Deep-Sea Biodiversity Assessment

An eDNA analysis pipeline that uses machine learning to discover and classify marine life in Earth's least explored ecosystems

Project Overview

Addressing the poor representation of deep-sea organisms in traditional reference databases

The Problem
Traditional bioinformatic pipelines rely heavily on sequence alignment to databases built from terrestrial and shallow-water species. This leads to significant misclassifications and underestimation of deep-sea biodiversity, as reference databases like SILVA, PR2, and NCBI lack comprehensive deep-sea organism representation.
Our Solution
An AI-driven pipeline using deep learning and unsupervised learning algorithms to identify eukaryotic taxa directly from raw eDNA reads. Our system minimizes reliance on reference databases while enabling discovery of novel taxa and reducing computational time through optimized workflows.

Project Showcase

eDNA Sampling Process

AI Pipeline Architecture

Species Classification Results

How Our System Works

A comprehensive AI pipeline that transforms raw eDNA data into actionable biodiversity insights

Data Collection
Environmental DNA samples are collected from deep-sea environments and processed to extract sequences of the 18S rRNA and COI marker genes for analysis.
AI Processing
Unsupervised clustering algorithms, including K-means, DBSCAN, and hierarchical clustering, analyze sequence data without relying on reference databases.
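To make the clustering step concrete, here is a minimal sketch of how reads could be grouped without a reference database. It assumes each read is turned into a normalized k-mer frequency vector, which is a common featurization but is not confirmed by this page, and it uses a small pure-Python DBSCAN for illustration only. The function names, the toy reads, and the `eps` and `min_pts` values are all hypothetical; a real pipeline would more likely use a library implementation such as scikit-learn's `DBSCAN`.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequency vector of a read (ACGT only)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point, -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices within eps of point i (includes i itself).
        return [j for j in range(len(points))
                if euclidean(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(j_nbrs)
    return labels


# Toy reads (made up): three AT-rich, three GC-rich, one unlike either group.
reads = [
    "ATATATATATAT", "ATATATATATTA", "TATATATATATA",
    "GCGCGCGCGCGC", "GCGCGCGCGCCG", "CGCGCGCGCGCG",
    "ACGTTGCAACGT",
]
profiles = [kmer_profile(r, k=2) for r in reads]
labels = dbscan(profiles, eps=0.3, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The read that fits neither group receives the noise label -1. Treating such reads as candidates for closer inspection is one possible design choice, not something this page specifies.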
Classification
Unsupervised learning identifies and classifies taxa across major groups: Annelida, Arthropoda, Chordata, Cnidaria, Echinodermata, Mollusca, and Porifera.
Discovery
The system estimates abundance, identifies novel taxa, and provides comprehensive biodiversity assessments for conservation and research.
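As a rough illustration of abundance estimation, the sketch below summarizes a list of per-read cluster labels into relative abundances. It assumes the label -1 marks reads that fit no cluster, and it uses the Shannon index as one standard diversity measure. Neither the label convention nor the choice of index is stated on this page, and the function names and sample labels are hypothetical.

```python
from collections import Counter
from math import log


def summarize_clusters(labels, noise_label=-1):
    """Return (relative abundance per cluster, fraction of unassigned reads)."""
    total = len(labels)
    if total == 0:
        return {}, 0.0
    counts = Counter(labels)
    abundance = {lab: n / total
                 for lab, n in sorted(counts.items())
                 if lab != noise_label}
    unassigned = counts.get(noise_label, 0) / total
    return abundance, unassigned


def shannon_index(abundance):
    """Shannon diversity H = -sum(p * ln p) over clusters, renormalized to sum to 1."""
    total = sum(abundance.values())
    if total == 0:
        return 0.0
    return -sum((p / total) * log(p / total)
                for p in abundance.values() if p > 0)


# Hypothetical labels: two clusters of three reads each, one unassigned read.
example_labels = [0, 0, 0, 1, 1, 1, -1]
abundance, unassigned = summarize_clusters(example_labels)
diversity = shannon_index(abundance)
print(abundance, unassigned, diversity)
```

Reads left unassigned are reported separately rather than dropped, so that a later step could examine them as possible novel taxa.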

Technical Capabilities

ML Algorithms: 3 (K-means, DBSCAN, Hierarchical)
Taxonomic Groups: 7 (major marine phyla coverage)
Marker Genes: 2 (18S rRNA & COI)