Dr Brett Drury of SciCrop receives an award from the Royal Academy of Engineering

We are pleased to announce that our Head of Research, Dr Brett Drury, has won the prestigious Leaders in Innovation Fellowship (https://www.raeng.org.uk/grants-and-prizes/international-research-and-collaborations/newton-fund-programmes/leaders-in-innovation-fellowships), which is awarded by the Royal Academy of Engineering. As part of this fellowship, Brett will attend an intensive course in London that is intended to help him commercialize his research internationally.

Working with a select number of partner countries, the primary objective of the Royal Academy of Engineering’s Leaders in Innovation Fellowships (LIF) programme is to build the capacity of researchers for entrepreneurship and commercialisation of their research. More broadly, the programme creates international networks of innovators and technology entrepreneurs.


As the UK’s national academy for engineering and technology, the Royal Academy of Engineering brings together the most talented and successful engineers to advance and promote excellence in engineering for the benefit of society (https://www.raeng.org.uk/about-us/what-we-do).


International Workshop on Applications of AI in Security

Along with colleagues from Israel, Brazil and Ireland, I am organizing a workshop at ECML in Dublin on the Applications of AI in Security. If you are a researcher in this field, especially in the area of food security, please consider submitting a paper or attending. The workshop is for both industry and academic researchers and practitioners. Practical system demonstrations are particularly welcome. More details can be found at: http://iwaise2018.it.nuigalway.ie


Dynamic Bayesian Networks from text that represent a specific crop

One of the probabilistic/data science projects inside SciCrop, conducted by our Head of Research, Brett Drury, aims to build Dynamic Bayesian Networks (DBN) from text that represent a specific crop in Brazil. These networks can be used to estimate the global production of that crop in Brazil. In this post we will explain the main concepts of how we do that.


Text is a good medium from which to extract knowledge, as it contains the opinions and knowledge of a large number of people. The information held in a large body of text exceeds the understanding that any individual or group of people has of the area. The network is built by extracting causal relations from text, creating a directed graph, removing inconsistencies and irrelevant information, and then converting the graph to a DBN.
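As a rough illustration of these steps, and not SciCrop's actual code, the sketch below turns a handful of made-up causal relations into a directed graph, drops weakly supported and contradictory edges, checks that no cycle remains, and unrolls the result into a two-slice template. The relation list, the `min_support` threshold and the choice of a two-slice template (each variable depends on its causes at time t and on itself at time t-1) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical extractor output: (cause, effect, number of supporting sentences).
relations = [
    ("drought", "low yield", 12),
    ("low yield", "high price", 9),
    ("high price", "increased planting", 4),
    ("low yield", "drought", 3),          # contradicts the first relation
    ("frost", "low yield", 6),
    ("celebrity news", "high price", 1),  # weakly supported, likely irrelevant
]

def build_graph(relations, min_support=2):
    """Keep well-supported edges and resolve contradictory directions."""
    support = defaultdict(int)
    for cause, effect, count in relations:
        support[(cause, effect)] += count
    edges = {}
    for (cause, effect), count in support.items():
        if count < min_support:
            continue  # drop weakly supported relations
        if count > support.get((effect, cause), 0):
            edges[(cause, effect)] = count  # keep only the dominant direction
    return edges

def topological_order(edges):
    """Return nodes in causal order, or raise ValueError if a cycle remains."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(nodes):
        raise ValueError("graph still contains a cycle")
    return order

def to_two_slice_template(edges):
    """Parents of each variable at time t: its causes at t plus itself at t-1."""
    parents = defaultdict(list)
    for cause, effect in sorted(edges):
        parents[(effect, "t")].append((cause, "t"))
    for node in sorted({n for edge in edges for n in edge}):
        parents[(node, "t")].append((node, "t-1"))
    return dict(parents)

edges = build_graph(relations)
order = topological_order(edges)
template = to_two_slice_template(edges)
```

The template only fixes the structure of the network. The conditional probability tables would still have to be estimated from data or elicited from the text.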


The extraction of causal relations and reasoning with large DBNs are very computationally intensive tasks, and there are no shortcuts to speed up the processing. Consequently, we rely upon High Performance Computing (HPC) techniques to ensure that the process runs in the shortest possible time without losing any expressiveness of the extracted data.

 

Causal relations are relatively sparse, and we extract them with an LSTM+CRF (Long Short-Term Memory with a Conditional Random Field) combination that uses Word2Vec word vectors with 600 classes. This outperforms the previous benchmark of using only a CRF with hand-crafted features. However, to extract a sufficient number of causal relations, a large amount of text has to be parsed. In our current experiments we are parsing 36K documents. In production we will parse a much larger corpus, in the order of a million documents.


The problem is embarrassingly parallel: a causal relation extractor can be run independently on each node. With multiple nodes, each causal relation extractor processes only a fraction of the whole corpus. In an extreme example, if we have 36K nodes, each node handles a single document, and we can process the aforementioned corpus in roughly the time it takes to process one document. Using cloud services such as Google Cloud (in our case) allows computing resources to be expanded almost without limit. An overnight task becomes computable in the time it takes to make a coffee.
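The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not our production setup: threads stand in for cloud nodes, and `extract_causal_relations` is a hypothetical placeholder that simply looks for the word "causes" instead of running the LSTM+CRF model.

```python
from concurrent.futures import ThreadPoolExecutor

def shard(documents, n_workers):
    """Split documents into n_workers roughly equal, disjoint shards."""
    return [documents[i::n_workers] for i in range(n_workers)]

def extract_causal_relations(document):
    """Placeholder extractor (assumption): keeps sentences containing 'causes'."""
    return [s.strip() for s in document.split(".") if " causes " in s]

def process_shard(docs):
    """Run the extractor over one shard; no shard depends on any other."""
    results = []
    for doc in docs:
        results.extend(extract_causal_relations(doc))
    return results

documents = [
    "Drought causes low yield. The harvest began in March.",
    "Frost causes crop damage.",
    "Prices were stable this week.",
    "Low yield causes high prices. Exports fell.",
]

n_workers = 2
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    per_shard = list(pool.map(process_shard, shard(documents, n_workers)))

# Merging is a plain concatenation because the shards never interact.
relations = [r for shard_result in per_shard for r in shard_result]
```

Because no shard needs anything from another shard, the only coordination is the final merge, which is why adding nodes shortens the run almost linearly.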


The construction and editing of graphs can be achieved on a single machine; however, performing inference on large graphs cannot. Larger graphs have more expressive power and therefore make better predictions, but they require more computational power. Typically, on a single machine, we reduce the number of nodes by merging them with others in the graph until the graph is small enough to process. An alternative is to use HPC and spread the reasoning workload across multiple nodes. At SciCrop we use the following stack:

 

 

This stack allows us to reason with very large graphs. We can retain the expressive power of large graphs, but with the short inference times of smaller graphs.


This project allows us to create expert models for any crop and encode this information into a directed graph. We can use this graph to reason about the market for that crop, and with data from a farm, we can reason about the farm’s crop production. The predictive ability of the graph improves as high-quality textual information about crops is collected daily and in high volume.
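To give a feel for what "reasoning" with a DBN means, here is a minimal sketch of a single filtering step on a toy network with one hidden variable (crop yield) and one daily observation. Every number and name below is invented for the example; a real network built from text would have many more variables and would not be processed by hand-written code like this.

```python
def filter_step(prior, transition, likelihood):
    """One DBN filtering step: predict with the transition model, then
    weight each state by the evidence likelihood and normalise."""
    states = list(prior)
    predicted = {
        s: sum(prior[p] * transition[p][s] for p in states) for s in states
    }
    unnormalised = {s: predicted[s] * likelihood[s] for s in states}
    total = sum(unnormalised.values())
    return {s: unnormalised[s] / total for s in states}

# Belief about yesterday's yield state (hypothetical values).
prior = {"low": 0.5, "high": 0.5}

# P(yield today | yield yesterday), hypothetical values.
transition = {
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.2, "high": 0.8},
}

# P(today's texts report drought | yield state), hypothetical values.
drought_report = {"low": 0.8, "high": 0.3}

belief = filter_step(prior, transition, drought_report)
```

Repeating this step as new evidence arrives each day keeps the belief about the crop up to date. On large graphs the same computation involves many variables at once, which is where the distributed stack comes in.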

The causal relation extractor is an LSTM Neural Network with a Conditional Random Field. This was built with Theano and Keras. The reasoning uses amidst, Apache Spark, and Apache Hadoop to distribute the workload to multiple nodes. This allows the construction of massive and detailed graphs. The larger and more detailed a graph is, the better its predictive ability.

With this approach we are able to create expert models for any crop and encode this information into a graph. We can use this graph to reason about the market for that crop, provided that we have a very large database of high-quality textual information about crops that is collected daily and in high volume.

Posted in research

Data Science and Agriculture

This new SciCrop blog is intended to introduce data science concepts to the agricultural audience. Data science is an important addition to the farmers’ armory to improve yields, reduce costs and increase profitability. This is especially true in the era of global warming, when advances from traditional sources are expected to level off around 2050. Plateauing food production will be a global disaster because of the increasing population and the rising popularity of gasoline alternatives such as sugar ethanol, which will place additional demands upon agricultural land.

Data science allows farmers and farm managers to make optimal decisions about many farm processes. This can lead to large gains, which can come from the aggregation of small gains throughout the farm or from the identification of poor decision making. Data science not only gives farmers evidence for their own intuitions, but also provides insights for decisions that may seem counterintuitive to even the most experienced farm manager.

Data science is simply a mix of techniques from computer and information science as well as mathematics and statistics. These techniques are used to analyze data that has been collected from a variety of sources, which may be open or proprietary, local or global. Inferences can then be made, and these insights are turned into actions and decisions. There is nothing mystical about the techniques used, and despite the best marketing efforts of a number of companies, the techniques are open and available to all. The “secret sauce” is the method selection, its application and access to data. A haphazard approach to data science can produce disastrous results.

In the coming weeks and months this blog will address issues in data science and agriculture, and if you have any questions about what is covered in this blog, please contact us at: info@scicrop.com


A Survey of the Applications of Bayesian Networks in Agriculture

In a new paper, Brett Drury, Head of Research at SciCrop, describes the importance of Bayesian Networks, a statistical/graphical model, as a machine learning approach for agriculture, despite the fact that they have so far seen little use in the field.

Here you can read the abstract:

The application of machine learning to agriculture is currently experiencing a “surge of interest” from the academic community as well as practitioners from industry. This increased attention has produced a number of differing approaches that use varying machine learning frameworks. It is arguable that Bayesian Networks are particularly suited to agricultural research due to their ability to reason with incomplete information and incorporate new information. Bayesian Networks are currently underrepresented in the machine learning applied to agriculture research literature, and to date there are no survey papers that currently centralize the state of the art. The aim of this paper is to rectify the lack of a survey paper in this area by providing a self-contained resource that will: centralize the current state of the art, document the historical progression of Bayesian Networks in agriculture and indicate possible future lines of research as well as providing an introduction to Bayesian Networks for researchers who are new to the area.

Here you can get access to the full paper.

Posted in research

Welcome to the new Research Blog of SciCrop

Hi, I am happy to announce that SciCrop has a new place to publish scientific information about the research domains related to our services and products.

This initiative is the idea of our Head of Research, Brett Drury, who will also keep this blog updated. Beyond his career in the software industry, Brett has a prolific body of academic work in artificial intelligence and data science. Most of his publications can be seen in his ResearchGate profile as well as in his Google Scholar profile.

Posted in research

SciCrop releases source code of software that facilitates and automates downloading of satellite images from the Copernicus constellation

Today SciCrop launches the new release of the SciCrop-Sentinel-Extractor project. For those who do not know, this project consists of an autonomous, distributed and multithreaded locator and extractor of images from the Sentinel satellites of the Copernicus space program (http://www.copernicus.eu/main/sentinels).

This application is built on services consumed from the API offered by the European Space Agency’s (ESA) In-Orbit Commissioning Review (IOCR).

SciCrop-Sentinel-Extractor solves a problem common to all researchers using the images provided by the Copernicus program: retrieving images automatically, with errors detected and corrected, and without human intervention.
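The general pattern behind "verification and correction of errors" can be sketched as a download loop that checks each file against an expected checksum and retries on failure. The snippet below is only an illustration of that pattern and is not taken from the SciCrop-Sentinel-Extractor source: the `fetch` callable, the use of SHA-256 and the retry limits are all assumptions.

```python
import hashlib
import time

def sha256_hex(data: bytes) -> str:
    """Hex digest used here as the integrity check (algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()

def fetch_verified(fetch, expected_digest, max_attempts=3, delay_seconds=0.0):
    """Call fetch() until its bytes match expected_digest or attempts run out.

    fetch is any zero-argument callable returning bytes; in a real extractor
    it would wrap the HTTP request for one satellite product.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
        except OSError as error:  # network or disk failure
            last_error = error
        else:
            if sha256_hex(data) == expected_digest.lower():
                return data
            last_error = ValueError(f"checksum mismatch on attempt {attempt}")
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # back off before retrying
    raise RuntimeError(
        f"download failed after {max_attempts} attempts"
    ) from last_error

# Simulated transfer: the first response is corrupted, the second is intact.
payload = b"sentinel product bytes"
responses = iter([b"truncated", payload])
data = fetch_verified(lambda: next(responses), sha256_hex(payload))
```

Passing the download step in as a callable keeps the retry logic independent of any particular API, so the same loop could be run from several threads or machines.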

Releasing this application and its source code reinforces SciCrop’s commitment to contributing to the scientific research ecosystem in the field of Earth observation and remote monitoring of vegetation.

To access the project repository, use this url: https://github.com/Scicrop/sentinel-extractor

Posted in open source