My work as a machine learning and research intern at NTNU. AIS data clustering catch prediction

mrlydv/prediction_ais_catch

Prediction_AIS_catch

The fishing industry is an important sector of the Norwegian economy, accounting for 4.6% of the total Norwegian export value. Global changes in climatic variables have affected, and continue to affect, marine fish and aquaculture production, yet machine learning (ML) methods have not been used extensively to study aquatic systems in Norway. The method proposed in this document aims to find, combine and explore relevant fishing activity data, with a focus on activities in Norway, and to develop data-specific tools for visualizing, observing, assessing, forecasting and managing fisheries.

In this paper, we explore a spatio-temporal dataset that combines AIS data (i.e., information on the trajectories of fishing vessels) with the corresponding fish catch (i.e., the quantity and type of fish caught). Together, the data describe fishing activities in the Norwegian Sea and the North Sea over the past two decades. The problem we try to solve is the prediction of catch given some underlying conditions. Because this is a prediction task, we use tree-based regression models such as Random Forest, XGBoost and LightGBM. For LightGBM, we wrote our own custom objective and custom evaluation function.

The approach is to build models for specific vessel groups, geographical locations and species. The study explores the relationship between the physical parameters of the vessels for each year and uses that relationship for further analysis. It shows how catch depends on physical parameters of the vessels (length, gross tonnage and power), on geographic location (latitude and longitude), on the species of fish targeted, on the tools (gear) used for fishing, and on the time (month) in which the fishing is to be done.
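The text above mentions a custom objective and a custom evaluation function for LightGBM but does not say which loss or metric was used. As a rough illustration only, the sketch below assumes a pseudo-Huber loss as the objective and mean absolute error in kilograms as the metric. The names `make_pseudo_huber_objective`, `mae_metric` and the `delta` parameter are hypothetical and are not taken from the repository. The callables follow the shapes that LightGBM's scikit-learn interface accepts: an objective returning the gradient and Hessian, and a metric returning a name, a value and a higher-is-better flag.

```python
import numpy as np


def make_pseudo_huber_objective(delta=1.0):
    """Build an objective callable of the form objective(y_true, y_pred) -> (grad, hess).

    Assumption: pseudo-Huber loss, L = delta^2 * (sqrt(1 + (r / delta)^2) - 1),
    with residual r = y_pred - y_true. The loss actually used in the project is not stated.
    """

    def objective(y_true, y_pred):
        residual = y_pred - y_true
        scale = 1.0 + (residual / delta) ** 2
        grad = residual / np.sqrt(scale)  # first derivative of L with respect to y_pred
        hess = 1.0 / scale ** 1.5         # second derivative of L with respect to y_pred
        return grad, hess

    return objective


def mae_metric(y_true, y_pred):
    """Evaluation callable returning (name, value, is_higher_better)."""
    value = float(np.mean(np.abs(y_true - y_pred)))
    return "mae_kg", value, False


# Hypothetical usage with LightGBM's scikit-learn interface (not executed here):
#   from lightgbm import LGBMRegressor
#   model = LGBMRegressor(objective=make_pseudo_huber_objective(delta=50.0))
#   model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], eval_metric=mae_metric)
```

A pseudo-Huber loss is one plausible choice for catch data because it behaves like squared error for small residuals and like absolute error for large ones, which limits the influence of unusually large landings. Other losses would fit the same interface.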
We have also included product condition (code) as a feature in our analysis. Because the total catch is the total weight of fish delivered, it is important to know beforehand how the delivery of the catch is planned. For example, the same number of fish can be delivered frozen (on ice), in which case the total weight also includes the weight of the ice: the recorded weight is larger even though the actual fish weight may be the same.

For the specific analysis in this paper, we considered vessels shorter than 10 m (which constitute 30.2% of all vessels) and four fish species codes for cod (‘1022 – Cod’, ‘102201 – Norwegian Cod’, ‘102202 – Northeast Arctic Cod’, ‘102204 – Other Cod’). We also split the model into a southern and a northern part based on the latitudes of the fishing locations: the range of latitudes is divided into two halves, with the upper half forming the northern model and the lower half the southern model. The code can easily be modified for a different length group or different fish species to predict the obtainable catch with an associated error bar.

The preliminary results are good for southern locations, while the northern region still needs improvement. With the proposed approach we achieved a mean absolute error of 47.6 kg on a test set. Our predictive results are preliminary, both in the temporal horizon of the data we were able to explore and in the limited set of learning techniques employed. The study does not explore fluctuations in catch caused by environmental variation or political interference. In this rapidly warming region, it is vitally important to understand how stocks may be further affected by climate change in addition to fishing pressure.
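The vessel-length filter, the cod species codes and the latitude split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's code. The record field names (`length_m`, `species_code`, `latitude`, `catch_kg`), the helper names and the toy records are all hypothetical. The split assumes that "two halves" means cutting at the midpoint of the observed latitude range.

```python
from statistics import mean

# Cod species codes listed in the text; the length threshold is the 10 m limit.
COD_CODES = {"1022", "102201", "102202", "102204"}
MAX_LENGTH_M = 10.0


def select_records(records):
    """Keep records for vessels shorter than 10 m that target one of the cod codes."""
    return [
        r for r in records
        if r["length_m"] < MAX_LENGTH_M and r["species_code"] in COD_CODES
    ]


def split_by_latitude(records):
    """Split records at the midpoint of the latitude range (assumed interpretation)."""
    latitudes = [r["latitude"] for r in records]
    midpoint = (min(latitudes) + max(latitudes)) / 2
    south = [r for r in records if r["latitude"] < midpoint]
    north = [r for r in records if r["latitude"] >= midpoint]
    return south, north, midpoint


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted catch, in kg."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Toy records invented for illustration; they are not real catch data.
records = [
    {"length_m": 9.5, "species_code": "102202", "latitude": 70.0, "catch_kg": 120.0},
    {"length_m": 8.0, "species_code": "1022", "latitude": 58.0, "catch_kg": 40.0},
    {"length_m": 14.0, "species_code": "1022", "latitude": 62.0, "catch_kg": 900.0},   # vessel too long
    {"length_m": 7.2, "species_code": "9999", "latitude": 66.0, "catch_kg": 55.0},     # made-up non-cod code
    {"length_m": 9.9, "species_code": "102201", "latitude": 66.0, "catch_kg": 75.0},
]

selected = select_records(records)
south, north, midpoint = split_by_latitude(selected)
```

Each half would then get its own model, and `mean_absolute_error` can be applied per region to compare the southern and northern results.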
It is likely that other centers of intense fishing activity hold similar data and could apply methods similar to those proposed here in their local context.
