[SiliconValley Winter Bootcamp] CropDoctor
Diagnose diseased crops with AI
Demo video💡
What is CropDoctor? 🪴
Our team began this project with an observation: although both the pet and plant industries are growing, there are many animal hospitals but few plant hospitals. That led us to ask, “Wouldn’t inexperienced farmers need a service to diagnose diseases in crops?”
In short, the service works as follows:
- When a crop appears to be ill but the illness is unknown, users can upload crop photographs to CropDoctor to diagnose the illness, find its causes and cures, and manage it.
- The history page keeps a record of previously diagnosed crops.
- Crop statistics and monthly statistics aggregated across all users help users anticipate disease and prevent it from spreading to their crops.
- The responsive design makes it usable not only on desktops but also on tablets and mobile devices. (CropDoctor was built as a responsive web application because its core function, uploading photos, is best suited to mobile devices.)
The pages are as follows:
- Main page : After selecting the crop category, you can upload a photo and press the Diagnostic button to go to the Result page.
- Result page : It tells you whether the crop is normal or abnormal. If it is abnormal, you can see the cause, symptoms, and treatment of the disease.
- Login page : You can log in and sign up. If you diagnose while logged in, the result is saved to your history; if you are not logged in, no record is kept.
- History page : It lists your previous diagnostic records, and you can delete any record you no longer want. You cannot enter this page unless you are logged in.
- Statistics page : Diagnostic statistics of all users are visualized here, and a summary highlights the key information.
System Architecture 🛠
Before development, our team set up the environment and libraries needed to implement the project.
The CropDoctor system architecture is shown in the image above.
NGINX was used as a web server.
Frontend 🖥️
On the frontend, we used Vite, which reflects changes to modular components in real time, and React, whose component-based structure makes code easy to reuse. TypeScript was used to convey the intended purpose of variables and functions more clearly, and to get feedback such as automatic code completion and error notifications for incorrect variable or function use. Tailwind was used for fast CSS styling.
Backend ⚙️
On the Backend, Django REST framework was used to design the API, and Swagger automatically documented the developed API.
Because Django handles each request synchronously by default, a slow AI inference call would block the request, so inference has to be handed off to a task queue. To solve this, we combined RabbitMQ and Celery, and adopted YOLO v5 as the AI model.
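The hand-off to a task queue can be pictured with a minimal sketch. This is not the project's Celery code: it swaps in Python's standard `queue` and `threading` modules for RabbitMQ and Celery, and `run_model`, `worker`, and `diagnose_all` are hypothetical names used only for illustration.

```python
import queue
import threading


def run_model(image_name: str) -> str:
    """Hypothetical stand-in for a slow YOLO v5 inference call."""
    return f"diagnosis for {image_name}"


def worker(tasks: queue.Queue, results: dict) -> None:
    """Process queued jobs one by one until the None sentinel arrives."""
    while True:
        job = tasks.get()
        if job is None:
            break
        task_id, image_name = job
        results[task_id] = run_model(image_name)


def diagnose_all(image_names: list) -> dict:
    """Enqueue every image, then wait for the background worker to finish."""
    tasks = queue.Queue()
    results = {}
    background = threading.Thread(target=worker, args=(tasks, results))
    background.start()
    for task_id, name in enumerate(image_names):
        tasks.put((task_id, name))  # returns at once; the caller is not blocked
    tasks.put(None)  # sentinel: no more jobs
    background.join()
    return results
```

In the real setup, the queue would roughly correspond to RabbitMQ (the broker) and the worker thread to a Celery worker process, so the web request can return while inference runs elsewhere.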
Login, disease, and previous diagnostic information were stored in the S3 bucket, and MySQL was used as the database.
To monitor the running environment, the ELK stack, Prometheus, and Grafana were used, and an alert was configured to be sent to Slack when an abnormality occurred under the specified conditions.
Each system was containerized with Docker, so it could be managed without worrying about dependency problems caused by installing various libraries. Docker-Compose was used to connect the containers. Finally, continuous deployment to AWS EC2 was set up using GitHub Actions.
Tech Stack 📚
Frontend : Vite, React, TypeScript, Tailwind
Backend : Django, Gunicorn, RabbitMQ, Celery, MySQL
AI : PyTorch (YOLO v5)
Etc : AWS (Route53, EC2, Load Balancer, S3 Bucket, RDS), GitHub Actions, Swagger, Docker-Compose, Slack, ELK, Prometheus, Grafana, Jira
AI Algorithm 🪄
We needed a model that both locates crops and diseases in images (bounding box regression) and classifies whether the crops are diseased.
YOLO performs both tasks in a single pass. We selected it because it is faster than 2-stage methods, which perform the two tasks separately.
YOLO Modeling Process
- Divide images into N x N grids
- Predict multiple bounding boxes where objects are likely to be located
- Extract a confidence score for each bounding box (confidence indicates how likely it is that the box contains an object of a class)
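The grid and confidence steps above can be sketched as follows. This is a simplified view of the original YOLO idea, not YOLO v5's actual multi-scale, anchor-based implementation; the function names and the dictionary layout of a box are assumptions made for illustration.

```python
def responsible_cell(cx: float, cy: float, grid_size: int) -> tuple:
    """Return the (row, col) of the grid cell that contains a box center.

    cx and cy are assumed to be normalized to the range [0, 1].
    """
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col


def keep_confident(boxes: list, threshold: float) -> list:
    """Keep only the predicted boxes whose confidence reaches the threshold."""
    return [box for box in boxes if box["confidence"] >= threshold]
```

For example, on a 4 x 4 grid a box centered at (0.5, 0.25) falls in row 1, column 2, and a threshold of 0.5 would discard a box predicted with confidence 0.3.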
Learning process
- Convert the training data to YOLO format, because the two formats are not the same.
- Adjust for class imbalance in the training data → there is more normal data than disease data
- Train the model (hyperparameters)
  - epochs: 50
  - batch_size: 6
  - imgsz: 640
  - optimizer: SGD
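As an illustration of the format conversion step, the sketch below turns one bounding box into a YOLO label line (`class x_center y_center width height`, all normalized by the image size). It assumes the source annotations give pixel corner coordinates; the original dataset format is not described here, so treat the input layout and the name `to_yolo_line` as hypothetical.

```python
def to_yolo_line(class_id: int, xmin: float, ymin: float,
                 xmax: float, ymax: float,
                 img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box into a normalized YOLO label line."""
    cx = (xmin + xmax) / 2 / img_w   # box center x, as a fraction of width
    cy = (ymin + ymax) / 2 / img_h   # box center y, as a fraction of height
    w = (xmax - xmin) / img_w        # box width, as a fraction of width
    h = (ymax - ymin) / img_h        # box height, as a fraction of height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

Each image would then get a text file with one such line per annotated object.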
The graph below shows the accuracy for each disease class of the cucumber model, which is one of the six crop models.
Each of the six trained models determines the location of crops and diseases for its own crop.
Monitoring Tools 💻
Prometheus was used to collect hardware resource metrics. Because the server had been running poorly due to insufficient RAM, RAM and swap usage were collected as well, along with the number of HTTP requests.
The collected metrics were visualized in Grafana. Grafana was also linked to Slack so that an alert is sent to Slack when memory usage exceeds 85 percent.
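The alert condition itself is simple arithmetic. The sketch below is not the Grafana rule used in the project; it only shows, under the assumption that total and available memory are reported in bytes, how an 85 percent threshold could be evaluated. The function names are hypothetical.

```python
def memory_used_percent(total_bytes: int, available_bytes: int) -> float:
    """Share of memory in use, as a percentage of total memory."""
    return (total_bytes - available_bytes) / total_bytes * 100


def should_alert(total_bytes: int, available_bytes: int,
                 threshold_percent: float = 85.0) -> bool:
    """True when memory usage is above the alert threshold."""
    return memory_used_percent(total_bytes, available_bytes) > threshold_percent
```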
Logstash was used to collect and store logs. Because logs are collected from only one Nginx instance, we created a log folder for Nginx and saved the logs there without using Redis.
We visualized the logs as graphs using Kibana. The line graph on the left shows the number of accesses over time, and the pie chart on the right shows the request keywords.
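To give a sense of the aggregation behind such charts, here is a small sketch that counts requests per hour and per path. It does not use Logstash or Kibana; it assumes log lines in Nginx's default combined format, and the sample paths in the usage note are made up.

```python
import re
from collections import Counter

# Matches the timestamp and request parts of an Nginx "combined" log line,
# e.g. [10/Feb/2023:13:55:36 +0900] "GET /some/path HTTP/1.1"
LOG_PATTERN = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)'
)


def summarize(lines: list) -> tuple:
    """Return (requests per hour, requests per path) as two Counters."""
    per_hour = Counter()
    per_path = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        per_hour[f"{match['day']} {match['hour']}:00"] += 1
        per_path[match["path"]] += 1
    return per_hour, per_path
```

Feeding it a few access-log lines yields counts keyed by hour (such as `10/Feb/2023 13:00`) and by request path, which is the kind of data a line graph and a pie chart would plot.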
Features 🎥
You can sign up with an ID in email format and a password only when both meet the required conditions, and you can then log in by entering the registered ID and password.
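A sign-up check of this kind might look like the sketch below. The actual conditions used by CropDoctor are not stated here, so the rules in this example (a basic email shape, and a password of at least 8 characters containing a letter and a digit) are purely assumptions, as is the function name.

```python
import re

# Loose email shape: something@something.something, with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_signup(user_id: str, password: str) -> bool:
    """Check an email-format ID and a password against assumed rules."""
    has_letter = any(ch.isalpha() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return (
        EMAIL_RE.match(user_id) is not None
        and len(password) >= 8
        and has_letter
        and has_digit
    )
```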
First, select the crop from the category and upload the picture. After that, press the diagnostic button to view the diagnostic results after a brief loading. The diagnosis results show the name of the disease, causes, symptoms, and solutions.
If you run a diagnosis while logged in, the result is stored, and you can review the diagnosed crops on the History page. The diagnostic function is available even when not logged in, but the history function is not.
The History page, which is only available when logged in, collects the crops that you have diagnosed. You can review them, delete them, and filter them by category.
By clicking an image, you can check the detailed information in a modal window.
You can check the number of diseases by crop and the number of crop diseases by period.
By clicking the button below, you can choose which charts to view and view additional chart summaries.
There are categories in the crop disease count chart by period, so you can check statistics by crop.
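The two statistics described above amount to simple grouped counts. The sketch below shows one way to compute them; the record layout (`crop`, `disease`, and an ISO `date` string) and the function names are assumptions for illustration, not the project's database schema.

```python
from collections import Counter


def count_by_crop(records: list) -> Counter:
    """Number of diseased diagnoses per crop (healthy results are skipped)."""
    return Counter(r["crop"] for r in records if r["disease"] is not None)


def count_by_month(records: list, crop: str = None) -> Counter:
    """Number of diseased diagnoses per month, optionally for a single crop."""
    return Counter(
        r["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        for r in records
        if r["disease"] is not None and (crop is None or r["crop"] == crop)
    )
```

Passing a crop name to `count_by_month` mirrors the category filter on the chart of disease counts by period.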
We predicted that many CropDoctor users would take photos with their phones and upload them directly to the website. Therefore, it was made into a responsive website to make it easier to see on mobile. The menu bar at the top changes to a hamburger menu when the screen gets smaller.
Errors we met and fixed 🩹
HTTPS, Nginx error
We had an issue with HTTPS implementation on EC2 instances on AWS. The domain's certificate was issued through AWS Certificate Manager, but the health status of the target group on port 80 was showing as "Unhealthy," leading to 404 errors when requests were sent to the specified port.
After some investigation, we realized that only the backend was connected to Nginx and the frontend was not. To resolve the conflict, we changed the port opened on Nginx to 443 and obtained the certificate through Certbot. We also built the frontend files, stored them in the container in advance, and added the following code to the Nginx configuration file to serve the frontend on port 80, which resolved the 404 errors.
location / {
    root /var/www/frontend;
    index index.html index.htm;
    try_files $uri $uri/ /index.html;
}
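The `try_files` line is what lets client-side routes resolve instead of returning 404. The sketch below imitates its lookup order against an in-memory listing. It is a simplification written for illustration, not how Nginx is implemented: it assumes every matched directory has an `index.html`, and the file and route names in the usage note are hypothetical.

```python
def resolve(uri: str, files: set, dirs: set) -> str:
    """Mimic `try_files $uri $uri/ /index.html` for a request path."""
    if uri in files:                       # $uri: an exact file match
        return uri
    directory = uri if uri.endswith("/") else uri + "/"
    if directory in dirs:                  # $uri/: a directory, served via its index
        return directory + "index.html"
    return "/index.html"                   # fallback: let the frontend router decide
```

With `files = {"/index.html", "/assets/app.js"}` and `dirs = {"/", "/assets/"}`, a request for `/assets/app.js` is served directly, while a route such as `/history` that exists only in the frontend router falls back to `/index.html`.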
Celery error
The team member who connected Celery and RabbitMQ faced an error when he distributed the code and tested it on other team members’ computers.
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
We suspected that the computer had forcibly terminated Celery because it was using a lot of memory. We then discovered the “pool” option of the Celery worker command, which selects how tasks are executed (for example, in processes or threads), and decided to use the “solo” pool, which runs only one task at a time in the worker process itself, to minimize memory usage. After switching to the command below, it worked normally on the other team members’ computers.
# Previous Commands
celery -A backend worker --loglevel=info
# Changed Commands
celery -A backend worker --loglevel=info --pool=solo
Reference ☀️
- AI model
- Train Dataset
🍀Meet our Team🍀
- 강용민 : Kyoungmin1016
- 이지윤 : jiyoon0701
- 백동열 : TMInstaller
- 김유라 : yura0302
- 권찬영 : fnzl54
- 황현성 : hstla
- 이규현 : Mayreeel