Data Science platform
The Strategic Science Investment Fund (SSIF) Data Science platform intends to significantly lift New Zealand’s capability, support and encourage dynamic and world-class data science research, and deliver on the Government’s data science investment goals.
The Government’s data science investment goals are to:
- make sure New Zealand has sufficient advanced data science capability to develop useful and transformative data science techniques
- create benefits for New Zealand.
MBIE funding
The Government is committed to investing $49 million between 2020 and 2027 in four Strategic Science Investment Fund (SSIF) Data Science research programmes.
Research programmes funded
The research programmes being funded by the SSIF platform for Data Science are:
Read the public statement
Aquaculture is New Zealand’s best opportunity to sustainably grow its Blue Economy, yet the industry is facing significant challenges in achieving its $1 billion revenue target by 2025. We now have more mussel farms, but the total annual yield hasn’t increased. Could it be due to climate change? The industry depends on natural sources of spat (mussel larvae) found in springtime off Ninety Mile Beach. Little is known about where it comes from, or how best to get spat to adhere to the growing ropes. Mussel farming relies on experience, but conditions change from farm to farm and season to season.
The aquaculture industry thinks that data science could be the answer. By bringing marine scientists together with specialists in machine learning, modelling, and data visualisation, we will develop new science to support decision-making, so farm managers can respond to climate challenges, manage disease, improve production yields, and farm sustainably at scale. Applying data science to marine farming makes sense.
In this programme, we will develop innovative data science techniques that will enable the aquaculture industry to produce efficiently and at large scale, producing high-quality, low-carbon protein for New Zealand and the world without compromising the environment. To do this, we will build data science knowledge in the industry.
Māori own large aquaculture assets. With our partners Whakatōhea and the Wakatū Incorporation, we will co-design a programme to educate young Māori in data science and create the next generation of industry leaders. All our students, from undergraduate to PhD, will work with industry within the research programme and through internships and summer projects.
Aquaculture data poses immense challenges for the researchers, so our programme is led by our best data scientists, guided by distinguished international researchers. Our students will learn from the world’s best.
Read the public update from the 2024/25 annual report
Over the fifth year, we have achieved significant progress in the 3 established case study themes on shellfish and finfish, and also Vision Mātauranga, as well as the fundamental research in data science and AI. The team has 39 high-quality publications in international journals and conferences. Our team members have also been invited to present our work at different international conferences. A number of team members have won best paper awards and competitions. We have also developed machine learning tools, such as online tools and demos for real-world users, so that the industry can apply our techniques. The Science Leader/PI was given the Australasian Artificial Intelligence Distinguished Research Contribution Award in 2024, and the ACM SIGEVO Outstanding Contribution Award in 2025.
Our capacity-building is highly effective in both recruiting and training students, including undergraduate students, summer/honours students, masters, PhDs, and postdocs, providing a solid framework for their academic and professional development to work on this programme. We have regularly managed the Māori scholarships/internships, including Undergraduate Scholarships/internships, Summer Scholarships, Graduate Awards, Master by Research Scholarships and PhD Scholarships, to attract, support and train our young Māori students and researchers in data science and application to aquaculture.
To improve our impact, we have delivered 14 talks/speeches, organised 7 different special sessions/issues and workshops, and published 1 newsletter to promote our programme and achievements. We have also invited 12 world-leading researchers to visit us and deliver distinguished talks and panel discussions. Our team members also play significant roles in the data science and aquaculture disciplines, such as associate editors in top journals, members of advisory boards for international conferences, and conference chairs. We have been actively engaging with industry partners, and attending NZ Aquaculture and Seafood industry conferences/meetings/forums. Finally, we have organised one workshop to strengthen the collaborations among team members, and provided opportunities for engagements with international leaders in Aquaculture.
The research of this programme has progressed well over the past year, from both fundamental and application points of view. The research continues to focus on image analysis, regression, prediction, and modelling, and core techniques from the technology themes such as transfer learning and domain adaptation have been expanded to multi-task learning for handling multiple tasks simultaneously. The application areas have expanded from mussel farms and fish image classification to oysters and river water flow prediction, king salmon, and sea bird detection. There are close connections between fundamental research and the case study themes. To address a task in the application theme, novel methods need to be developed from the technology themes. For example, we have developed new methods for transferring knowledge from data to help solve other tasks, including transferring knowledge while learning and transferring from a pre-trained model. The developed methods have been used for improving buoyancy detection and multi-object (i.e., multi-buoy) detection, as well as water flow in rivers for shellfish (oyster) farm production. Neural networks, statistical learning, and genetic programming methods have also been investigated for salmon feeding efficiency and health detection.
Read the public update from the 2023/24 annual report
Over the fourth year, we have achieved significant progress in the four established case study themes on shellfish and finfish, and Vision Mātauranga, as well as the fundamental research in data science and AI. The team has 34 high-quality publications in international journals and conferences. Our team members have also been invited to present our work at different international conferences. A number of team members have won best paper awards and competitions. We have also developed machine learning tools, such as online tools and demos for real-world users, so that the industry can apply our techniques.
Our capacity-building is highly effective in both recruiting and training students, including undergraduate students, summer/honours students, masters, and PhDs, providing a solid framework for their academic and professional development to work on this programme. We have regularly managed the Māori scholarships/internships, including Undergraduate Scholarships/internships, Summer scholarships, Graduate Awards, Master by Research Scholarships and PhD Scholarships, to attract, support and train our young Māori students and researchers in data science and application to aquaculture. We have offered Māori undergraduate Scholarships to three students, and one of them is currently working with us as a summer student.
To improve our impact, we have delivered 17 talks/speeches, organised 6 different special sessions/issues and workshops, and published 2 newsletters to promote our programme and achievements. We have also invited 9 world-leading researchers to visit us and to deliver distinguished talks and panel discussions. Our team members also hold significant roles in the data science and aquaculture disciplines, such as associate editors in top journals, members of advisory boards for international conferences, and conference chairs. We have been actively engaging with industry partners, and attending NZ Aquaculture and Seafood industry conferences/meetings/forums. Finally, we have organised one workshop to strengthen the collaborations among team members, and provided opportunities for engagements with international leaders in Aquaculture.
Work generated by this programme has been recognised by the professional UK magazine WORLDFISHING & AQUACULTURE. One of our papers, accepted by the Journal of the Royal Society of New Zealand, has been recognised and shared on LinkedIn by the Australian Society for Fish Biology, a professional organisation of fish and fisheries researchers.
Read the public update from the 2022/23 annual report
Over the third year, we have achieved significant progress in the four case study themes on shellfish production, finfish breeding, fish health prediction, and Vision Mātauranga, as well as the fundamental research in data science and AI.
In terms of research, each case study has been progressing well based on the collaborations between experts from both the aquaculture and data science sides. New aquaculture related data and applications are also explored and added to the case studies, greatly enhancing the case studies. The team has a good number of high-quality publications in international journals and conferences. Our team members have also been invited to present our work in different international conferences, workshops and summer schools via plenary and keynote talks, specialised tutorials, best paper awards and competitions.
We have successfully built a pipeline with Māori scholarships/internships, including Undergraduate Scholarships/internships, Summer scholarships, Graduate Awards, Master by Research Scholarships and PhD Scholarships, to attract, support and train our young Māori students and researchers in data science and application to aquaculture. We have successfully offered Māori Undergraduate Scholarships to three Māori students, and expect to offer more soon.
To improve our impact, we have delivered different kinds of talks/speeches, organised different special sessions/issues and workshops, and published newsletters to promote our programme and achievements. We have also invited world-leading researchers to visit us and to deliver distinguished talks and panel discussions. Our team members also play leading and important roles in the data science and aquaculture disciplines, such as associate editors in top journals, and membership of advisory boards for international conferences. We have been actively engaging with industry partners, and attending NZ Aquaculture and Seafood industry conferences/meetings/forums. Finally, our Te Whiri Kawe, the Centre for Data Science and AI (https://www.wgtn.ac.nz/cdsai), was successfully launched by Minister Dr Ayesha Verrall in June 2023.
Launch of Te Whiri Kawe, Centre for Data Science and Artificial Intelligence (Victoria University of Wellington)
Read the public update from the 2021/22 annual report
The goal of this programme is to support decision-making in the aquaculture industry by bringing marine and aquaculture scientists together with specialists in machine learning, modelling, and data visualisation to develop new science. In this programme, we aim to develop innovative data science techniques that will enable the aquaculture industry to produce efficiently and at a large scale, producing high-quality, low-carbon protein for NZ and the world without compromising the environment.
To do this, we will build data science knowledge in the industry. Māori own large aquaculture assets. With our partners Whakatōhea and the Wakatū Incorporation, we will co-design a programme to educate young Māori in data science and create the next generation of industry leaders. All our students, from undergraduate to PhD, will work with industry within the research programme and through undergraduate internships, postgraduate scholarships and summer projects.
Over the past year, we have made significant achievements in different aspects. We have established four application case study themes which focus on shellfish production, finfish breeding, fish performance and health prediction, and Vision Mātauranga, respectively. Each theme is run by experts from both industry and academia. We have published 59 papers including fundamental research in data science and applications to aquaculture. To improve our impact, we have delivered 19 talks/presentations, organised 21 special sessions/issues and workshops, and published 1 newsletter to make our achievements more visible. We have also started building a pipeline which provides scholarships and internships to help young Māori in data science. In addition, we have been collaborating with top international researchers to carry out research, and engaged with world-leading researchers by inviting talks and discussions. More details can be found on our programme website:
Progress Report 2022 - Groups/DataScienceForAquaculture | ECS (Victoria University of Wellington)
More information:
Data science for aquaculture (Victoria University of Wellington)
Read the public statement
Led by Te Hiku Media, our research will lead the revitalisation of minority and indigenous languages and the indigenisation of digital devices worldwide. Over the next seven years, our research programme will bring together world-leading data scientists from New Zealand, Cambridge and Oxford Universities, Māori communities and Mozilla in a unique collaboration to tackle this challenge.
Our proposal aims to establish a multilingual language platform to develop natural language processing tools and methods that will enable New Zealanders to engage with technology in the language they use or aspire to use every day. Starting with te reo Māori and New Zealand English, our programme will ensure a New Zealand identity is firmly embedded in the digital world. We will also extend into the Pacific and work with Samoan and Hawaiian communities. Our tools will make it possible to switch between these languages, so people can speak into their devices to “find a choice as kai of panipopo”. Most importantly, however, we will secure the future for these languages in a changing, dynamic, digital world.
Currently, minority languages do not have datasets big enough for existing methods to work, so these languages and their communities are largely invisible and unheard in these contexts. Their existence is under threat because, as digital technologies further permeate our day-to-day life, engaging with the language and transmitting it intergenerationally becomes more and more difficult. Our research programme will make it possible for ‘low resource’ languages and their speakers to fully participate in a digital context by creating cutting-edge technology.
Read the public update from the 2024/25 annual report
"Kei tona kainga, kei tona whare anake te poropiti hapa ai i te honore" is a phrase from the Māori bible. It translates to mean that at home, prophets are not honoured for their work. This has come to be used for leaders who face challenges when pushing for change at home, and that recognition comes later, once others from outside the community acknowledge the value of their work. For the Papa Reo project this year, the phrase highlights that while we have gained major international recognition and have received many accolades, at home, our value is found in our hard work and the small gains we make in our research and the many contributions we make to the community.
The momentum of refining our bilingual automatic speech recognition tool, including our data curation and benchmarking practices, alongside our stance on data sovereignty and ethical AI, was recognised as world-leading and has made an impact on other Indigenous language communities. There have been several highlights this year that demonstrate the global impact of Papa Reo. Firstly, our project leader was recognised in the TIME100 Most Influential People in AI 2024. Peter-Lucas Jones stated in his profile for the list: “In the digital world, data is like land. If we do not have control, governance, and ongoing guardianship of our data as indigenous people, we will be landless in the digital world, too.” This resonated with many Indigenous communities around the world and has been a powerful rallying call. It led to invitations to present at the World Economic Forum and the United Nations, as well as indigenous research events. Our team had the first Indigenous peer-reviewed poster paper accepted at NeurIPS, one of the premier conferences in the field of machine learning and artificial intelligence. Furthermore, the work we do is making significant contributions to global discussions on intellectual property rights and Indigenous data sovereignty, and appears in international research about AI in education, business and governance.
Another international highlight was the launch of Lauleo. This was the culmination of many years of relationship-building and building trust and understanding about the aspirations and needs of the Hawaiian language community. With the support of our partners, Lauleo successfully gathered over 413 hours of 'ōlelo Hawai'i from over 1000 voices. This corpus can now be used to develop impactful NLP tools for 'ōlelo Hawai'i.
At home, Papa Reo continues to improve our existing models, ensuring they are market leaders, outperforming the likes of OpenAI and Meta. We continue to grow capacity and capability in data science for Aotearoa, with two recent Māori graduates joining our team as Junior data scientists, both of whom have been making an impact with their work supporting 'ōlelo Hawai'i and te reo Māori. Papa Reo has co-funded research, including that of Himashi Rathnayake, whose research is helping develop the world’s first speech emotion recognition system designed specifically for te reo Māori. Our tools continue to support academic research and contribute to the health and education sectors. Papa Reo is on track to be the multilingual platform we aspired to at the beginning and is making an impact on the global stage and at home.
Indigenous data sovereignty in intangible cultural heritage governance: A complementary approach to public–private partnerships (cambridge.org)
He Waka Eke Noa: Navigating AI Futures with Aboriginal and Māori Knowledge (Center for Open Science, OSF)
Read the public update from the 2023/24 annual report
"Ehara te toka i Akiha, he toka pakupaku, he toka whitianga-ā-rā; ka pā tāu ko te toka o Mapuna, tēnā tāu e titiro ai ko te ripo kau." This whakataukī describes two approaches to the research and work undertaken by Papa Reo this year. The first, like Akiha, is visible and all of its qualities can be seen. The second, like Māpuna, is hidden below the surface and is only noticed by the ripples and currents that swirl around it.
This year, Papa Reo had, like Akiha, moments of very public recognition for the work undertaken by the team. The Rise 25 award to the Chief Technology Officer of the project, Keoni Mahelona, was a global recognition of the efforts to build a language platform for under-resourced languages and maintain sovereignty over data. Invitations and subsequent presentations to the Creative Commons Summit, National Digital Forum and the ADA Copyright Forum, for example, are all moments where the project, the project goals and the team's efforts have all had a moment in the sun.
Like Māpuna, however, much of what we do occurs below the surface and is often incremental steps to advancing the technological goals of the project. Our models are continually validated and benchmarked, which is a vital step to maintaining integrity and legitimacy in the AI community. Our models outperform efforts by big tech to deliver NLP tools for te reo Māori and we believe that this is as a result of our approach, our relationships and the work we conduct below the surface.
Read the public update from the 2022/23 annual report
"Kua tawhiti kē tō haerenga mai kia kore e haere tonu. He nui rawa ō mahi kia kore e mahi tonu. You have come too far not to go further; you have done too much not to do more." (Tā Hemi Henare).
Having successfully built a strong research team and a set of foundation tools for te reo Māori, Te Hiku Media and the Papa Reo project focussed on maximising the impact of the tools and their use and relevance to Aotearoa. This meant tackling the bilingual research problem, ensuring the speech recognition tools will accurately transcribe te reo Māori and NZ English. The new bilingual model was released in early 2023 and, when benchmarked against models released by the likes of Meta and OpenAI, is at least 50% more accurate for both languages and performs better across subsets such as contemporary speakers of NZ English and archival native speakers of te reo Māori.
As Big Tech launched Large Language Models like ChatGPT into the market with abandon and little care for the consequences, Te Hiku Media has advanced its advocacy for ethical approaches to data collection, training of machine learning models and indigenous data sovereignty. This has seen global recognition, with members of the team invited to participate in discussions hosted by Stanford Institute for Human-Centered Artificial Intelligence and the Computing Research Association of America, and to become members of the Partnership in AI Task Force for Inclusive AI. On the home front, Te Hiku Media continued to make an impact in the Māori language community by getting tools such as Rongo and Kaituhi into the hands of users, further claiming the technological landscape for te reo Māori.
Read the public update from the 2021/22 annual report
“Iti pioke nō Rangaunu, he au tōna. Small as the dog shark of Rangaunu may be, great is its wake.”
The second year of Papa Reo has been about building the momentum of the project. Despite being a small team, the impact of Papa Reo has been felt extensively across the language technology and the indigenous data sovereignty spaces. Papa Reo worked closely with University-based research teams providing access to tools and data. We also supported undergraduate internships and postgraduate research projects.
Papa Reo also invested in growing the capacity and capability of our team and in our tools. Kaitāia now houses one of the fastest AI servers in the country. Ōrongonui, named for the moon phase when it was turned on, means the team has been able to experiment and innovate without constraint. Our new team of data scientists, developers and engineers can now experiment with novel approaches quickly, creating and improving our tools.
With this state-of-the-art infrastructure, Papa Reo continues to develop a multilingual language platform. Papa Reo deployed a new Māori text-to-speech model. We launched the Rongo app on Apple using a model that provides real-time feedback to improve pronunciation. The first Māori speech-to-text model developed by Te Hiku Media continues to be refined with a reduced word error rate while we prepare to train a new model using novel approaches. With these models in place, over the coming years, we will reduce the work required for other under-resourced languages.
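For readers unfamiliar with the metric, word error rate is commonly computed as the number of word-level substitutions, deletions and insertions needed to turn a model's transcript into the reference transcript, divided by the number of words in the reference. The following Python sketch shows that standard definition only. It is not Te Hiku Media's evaluation code, and the function name and example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; real evaluations usually
    normalise case and punctuation before comparing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one word of a four-word reference is missing.
print(word_error_rate("kia ora e hoa", "kia ora hoa"))
```

A lower value means fewer transcription mistakes, so a reduced word error rate indicates a more accurate speech-to-text model.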
We remain an active voice in the recognition of indigenous data sovereignty, contributing to global conversations, for example at APEC, and on the ground in national forums. The growing awareness of Te Hiku Media’s position on data sovereignty continues to create opportunities to educate and drive our decisions when choosing tools and partners.
Read more:
A language platform for a multilingual Aotearoa (Te Hiku Media)
Read the public statement
Data are essential to research, understand, set policy for and manage New Zealand’s environment, but environmental data presents many challenges that require new data science methods to overcome them, and a substantial increase in the capability of environmental researchers, governors and managers to use data science in their work. This programme will develop those new methods and build the required capability.
In particular, we will focus on developing methods to deal with environmental datasets that are collected in large volumes over time and must therefore be dealt with as streams that are analysed incrementally, as they are measured, rather than as collections of data that can be analysed all at once. These methods will address underlying characteristics of the data that evolve over time (e.g. due to climatic or ecological changes), and data that are collected at a range of time intervals and spatial scales, from broad-scale satellite images to single-point measurements on the ground, in the water or in the air. The methods we develop will be interpretable and explainable (to help users understand why an algorithm produces some particular output), identify and understand anomalies (to distinguish 'normal' from 'unusual' measurements) and quantify uncertainty in algorithm output (to help decision-makers understand how confident they can be in conclusions drawn from the data science methods).
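As an illustration only, the idea of analysing measurements incrementally as they arrive, rather than all at once, can be sketched in a few lines of Python. The sketch below is not the programme's method: it assumes a simple running mean and standard deviation (Welford's online algorithm) and flags a reading as unusual when it lies more than a chosen number of standard deviations from the running mean. The names RunningStats and flag_anomalies, the threshold, the warm-up length and the sample readings are all hypothetical.

```python
import math


class RunningStats:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; undefined for fewer than 2 readings.
        if self.n < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.n - 1))


def flag_anomalies(stream, threshold=3.0, warmup=5):
    """Yield (index, value, is_anomaly) one reading at a time.

    Each reading is scored against the statistics of the readings seen
    before it, then folded into those statistics. Nothing is flagged
    until `warmup` readings have been seen.
    """
    stats = RunningStats()
    for i, x in enumerate(stream):
        is_anomaly = False
        if stats.n >= warmup and stats.std > 0:
            z = abs(x - stats.mean) / stats.std
            is_anomaly = z > threshold
        yield i, x, is_anomaly
        stats.update(x)


# Hypothetical sensor readings with one obvious spike.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 25.0, 10.1]
for index, value, unusual in flag_anomalies(readings):
    if unusual:
        print(f"reading {index} looks unusual: {value}")
```

Because each reading updates the summary in constant time and memory, the full history never needs to be stored. This simple rule does not handle data whose characteristics evolve over time; a method for drifting data would need, for example, a sliding window or a forgetting factor.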
To deliver the methods we develop in a form that environmental scientists and managers can use, we will build a new open source framework to do machine learning on time series data, and provide an open access repository of environmental datasets to improve reproducibility in environmental data science. Through workshops, undergraduate and postgraduate research projects within the programme, we will build New Zealand’s capability in fundamental and applied data science relevant to environmental data, from introductory to postdoctoral level.
Read the public update from the 2024/25 annual report
TAIAO continues to grow as a leading hub for environmental data science in Aotearoa New Zealand. Our community platform has expanded to over 400 registered users, and more than 1,100 followers across all our platforms, reflecting a strong and engaged network committed to open-source collaboration, data sovereignty, and Vision Mātauranga. Over the past year, we shared new datasets, tutorials, and software, fostering capacity-building and supporting researchers, students, and practitioners nationwide.
Our research team advanced methods in machine learning for data streams, anomalies, and deep learning, with several papers published at top international venues. CapyMOA, our open-source machine learning library, gained global recognition and widespread uptake. We also celebrated the completion of several PhDs, contributing new knowledge in areas such as flood prediction, climate anomalies, and water quality monitoring. In addition, we began exploring the potential of quantum machine learning, opening up opportunities for the next generation of environmental data science methods.
We actively supported several exciting networking and conference initiatives. In addition to hosting the Environmental Data Science and AI Summit at Victoria University in August 2024, we are making our contributions globally known through the discovery session at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) at Vilnius in September 2024. We also continue to support Indigidata Aotearoa, an initiative focused on developing an understanding of Indigenous data science and sovereignty, specifically tailored for Māori participants. We were proud to continue supporting the AI Hackathon Festival for the third year running and hosting the hackathon event for the Waikato regions. It was heartening to see teams competing with a passion to improve societal outcomes for Aotearoa New Zealand.
Indigidata Aotearoa
Case studies continued to drive impact, with progress in flood prediction, forest monitoring, under-sea habitat annotation, kiwi conservation, and species classification. These efforts demonstrate our commitment to developing practical, world-leading AI solutions that help address pressing environmental challenges.
Read the public update from the 2023/24 annual report
TAIAO’s vision is to advance data science by providing robust, accessible tools and methods for New Zealand's environmental sectors. Our community platform upholds data sovereignty and open-source principles, supporting Vision Mātauranga and fostering growth across the data, environmental science, and software development communities. In line with these goals, we launched CapyMOA in March 2024, an advanced machine learning library specifically designed for data streams. CapyMOA significantly improves real-time data processing, offering faster, scalable, and more accessible machine learning for continuous data streams. Additionally, we have published a range of work aligned with our research aims in peer-reviewed journals and conference proceedings over the past year.
We actively supported a number of exciting networking and conference initiatives. In addition to hosting our annual TAIAO Workshop in August 2023, we contributed to the 21st Australasian Data Mining Conference (AusDM 2023) and the AI Researchers Association Annual Conference 2024, where we showcased our research and engaged with the broader data science community. We also supported Indigidata Aotearoa, an initiative focused on developing an understanding of Indigenous data science and sovereignty, specifically tailored for Māori participants. We were proud to support the AI for the Environment Hackathon Festival last year, and this year we are again hosting the hackathon event for the Waikato and Canterbury regions. It was heartening to see teams competing with such passion for improving environmental outcomes for Aotearoa New Zealanders.
In addition to the three existing case studies, we have begun developing a new case study in collaboration with Sanctuary Mountain Maungatautari. The goal is to create a flexible and extensible annotation platform designed to address the complex challenges of annotating endangered kiwis. This platform aims to be a transformative solution for the detailed analysis of photographic and video imagery.
Read the public update from the 2022/23 annual report
The vision of TAIAO continues to be to enable the next level of data science to provide robust and fit-for-purpose tools and methods that are accessible and useful to researchers and practitioners across all areas of the New Zealand environment. In the third year of our project, we've achieved significant milestones, including the continued expansion of the community formed during the initial two years. Additionally, our team has gained deeper insights into the needs of potential end-users within Aotearoa New Zealand.
The TAIAO community platform (taiao.ai) received a new look and feel that was launched in November 2022. With this fresh launch, we introduced “Categories”, which facilitate filtering of Datasets, Notebooks, Software, and Tutorials. Aligned with the essence of TAIAO, the platform maintains its status as both data sovereign and open source. These foundational principles are instrumental in upholding our dedication to Vision Mātauranga and in cultivating not only the data and environmental science communities but also the software development community.
In addition to the two existing case studies, we have begun developing a new case study in collaboration with the Department of Conservation, with the goal of creating a flexible and extensible annotation platform. This platform aims to tackle the intricate challenges associated with annotating undersea habitats.
The TAIAO Machine Learning Course for Flood Practitioners took place in June 2023 at the University of Waikato, drawing attendees from the Waikato Regional Council, including flood practitioners, hydrologists, and environmental scientists. This event has generated interest from various other regional councils for potential future editions of the course. Efforts are under way to encourage greater engagement on the TAIAO community platform, and we plan to establish a new dedicated category within the platform to support this.
Read the public update from the 2021/22 annual report
The vision of TAIAO continues to be to enable the next level of data science to provide robust and fit-for-purpose tools and methods that are accessible and useful to researchers and practitioners across all areas of the New Zealand environment. Achievements for year two are continued strong growth of the community established during the first year, and increased team understanding of the needs of potential end-users in Aotearoa New Zealand.
The TAIAO community platform (taiao.ai) has continued its iterative improvement, culminating in a public launch in November 2021. Feedback from the environmental data science community after the launch was strongly supportive of our intent to develop the TAIAO community through the sharing of datasets, notebooks, kōrero and resources. True to the intent of TAIAO, the platform will remain data sovereign and open source. These two fundamentals drive both the commitment to Vision Mātauranga and the ability to build the data and environmental science communities alongside the software development community.
We have been working on two case studies for the TAIAO data platform based on a data mesh architecture. For the first case study, in collaboration with the Waikato Regional Council and MetService, we created an operational live archive and API for MetService's rain radar data. The archive has been populated with all the historical surveillance radar scans from the Auckland and the Bay of Plenty radars and is progressively being backfilled with data from the other New Zealand rain radars.
The second case study is ongoing work with SCION that takes advantage of data from the ForestFlows system, which monitors forests in real time. This work will help to increase the flow of information about the forests of New Zealand to interested entities.
Read more:
Machine Learning for Streams(external link) — Time-Evolving Data Science and Artificial Intelligence for Advanced Open Environmental Science (TAIAO)
Read the public statement
Data science facilitates new approaches to longstanding problems in healthcare, policy, ecology and economy. But to make the most effective use of it, we need analytical methods that are straightforward to apply, open to review and audit, and produce results that can be correctly interpreted by practising researchers and policymakers. Furthermore, we need methods that discover, gather and integrate potentially useful data with minimal human intervention, and ensure that the most suitable analysis methods are used with such data. Finally, we need to empower a whole generation of researchers, across all fields, to use these new methods in robust and defensible ways.
Our team comprises researchers from the Universities of Auckland, Otago and Canterbury, and from Massey University. In it, computer scientists and statisticians will work alongside domain scientists in fields such as computational biology, ecology and public health. Over the next 7 years, we will improve the application of data science methods in complex research settings, make processing more efficient, and create transparent and computationally reproducible workflows that are published, open and easily reused. We will commit the majority of our budget to training and equipping the doctoral and post-doctoral researchers who will go on to apply data science methods to improving our environment, economy and society.
Read the public update from the 2024/25 annual report
Our overall aim is to develop new data science methods that help to improve the process of conducting and reporting research, and to use these developments to improve our national capability in genomics, population health and ecology research, supporting our bio-economy while improving health and environmental outcomes.
The team has achieved significant research progress this year in our 3 distinct areas:
- bioinformatics - studying the relationships between genomics and disease to help improve our response to agricultural disease outbreaks and human health outcomes;
- ecological modelling - creating new models to study the entire genome of threatened and invasive species to assess ecological impact and remediation;
- live and transparent data science - creating new ways of publishing and updating science studies to keep them up to date as new data and methods emerge, reducing the time that researchers spend updating, re-running and reporting their findings.
In each of these areas we have developed new methods and related software that has been shared openly with the science community.
Collaborations within our team are now leading to additional grant proposals in the areas of AI and genomics, and we have strengthened our collaborations with other research partners, both in Aotearoa (notably Public Health Science (formerly ESR), Genomics Aotearoa, the RNA Platform, DoC, MPI and MoH) and overseas (the Allen AI Institute and Argonne National Labs in the USA, the Federal Institute for Risk Assessment (BfR) and the Robert Koch Institute in Germany, the Institute of Zoology in the UK, and ETH in Switzerland).
This year our outreach & training event, ResBaz 2025, provided over 2300 individual researchers throughout the country with hands-on data science training that they can apply in their own research. We are also training a future generation of NZ data scientists via our summer placements, PhD scholarships and post-doctoral appointments.
Read the public update from the 2023/24 annual report
Our overall aim is to develop new data science methods that help to improve the process of conducting and reporting research, and to use developments in data science to improve our national capability in genomics and biology research, thus helping New Zealand to remain competitive in these two important areas.
One of our major project goals is to help make research more transparent and trustworthy. Over the past year, we have developed a scientific claim verification methodology and an associated tool that allows questions to be posed about the accuracy of, and supporting evidence for, any claim made against the scientific literature. An associated web application allows claims to be evaluated in real time against established scientific literature. This kind of tool helps to empower non-scientists to ask specific questions about scientific findings and evidence, and to explore for themselves the peer-reviewed evidence that supports, or refutes, any science claim.
Another major achievement over the past year is the use of Digital Twin technology to create a genomic digital twin of a threatened native species, the Hihi (stitchbird). This digital twin allows us to model with precision how the genomics of the current Hihi population will affect its ability to survive and thrive in a changing world. Such methods allow us to better understand how our precious flora and fauna will react in the future to changes in their environment, so we can understand associated risks before changes occur and plan for better mitigation.
Collaborations within our team are now leading to additional grant proposals in the areas of AI and genomics, and we have strengthened our collaborations with other research partners based in Aotearoa (notably ESR and Genomics Aotearoa) along with several prestigious overseas collaborations, including the Allen AI Institute and Globus Labs.
We sponsored and helped to deliver a significant national outreach & training event (ResBaz 2024) enabling over 1000 researchers to learn new data science skills that they can apply to their own research.
Event schedule(external link) — ResBaz Aotearoa
Read the public update from the 2022/23 annual report
The team has achieved significant research progress this year in four distinct areas: (i) phylogenetic analysis: studying the relationships between genomics and disease using advanced data science methods; (ii) ecological modelling: contrasting geographical and genomic differences within endangered species; (iii) live science publishing: creating a new way of publishing science experiments embedded within traditional research articles, which is now gaining traction with other researchers; and (iv) AI tools that can validate or refute scientific claims, for example: "Covid vaccines cause sterility in males". In each of these areas we have developed new methods and related software that is being shared openly with the science community.
Collaborations within our team are now leading to additional grant funding in the areas of AI and genomics, and we have strengthened our collaborations with other research partners based in Aotearoa (notably ESR and Genomics Aotearoa, with plans to extend to the national RNA Platform once it is underway).
We sponsored a significant national outreach & training event (ResBaz 2022: https://resbaz.auckland.ac.nz/sessions/(external link)) enabling several hundred applied researchers throughout the country to learn new data science skills that they can apply in their own research. We are also now training a future generation of NZ data scientists via our PhD and post-doctoral scholarships and via related AI Carpentry work.
Taken together, this progress is producing the new methods, code, collaborations and workforce that will ensure Aotearoa can take better advantage of data science in its future research endeavours in key industries such as genomics and AI and branches of government such as public health and debate on science and policy.
For more information, contact Prof. Mark Gahegan, Professor of Computer Science at the University of Auckland, email: m.gahegan@auckland.ac.nz
Read the public update from the 2021/22 annual report
The team has achieved significant research progress this year in three distinct areas: (i) phylogenetic analysis—studying the relationships between genomics and disease, (ii) ecological modelling—contrasting geographical and genomic differences within endangered species and (iii) live science publishing—creating a new way of publishing science experiments embedded within traditional research articles. In each of these areas we have developed new methods and related software that has been shared openly with the science community.
Collaborations within our team are now leading to additional grant proposals in the areas of AI and genomics, and we have strengthened our collaborations with other research partners based in Aotearoa (notably ESR and Genomics Aotearoa).
We sponsored a significant national outreach & training event (ResBaz 2021: https://resbaz.auckland.ac.nz/sessions/(external link)) enabling hundreds of applied researchers throughout the country to learn new data science skills that they can apply in their own research. We are also now training a future generation of NZ data scientists via our PhD and post-doctoral scholarships.
Taken together, this progress is producing the new methods, code, collaborations and workforce that will ensure Aotearoa can take full advantage of data science in its future research endeavours in key industries and branches of government.
For more information, contact Prof. Mark Gahegan, Professor of Computer Science at the University of Auckland, email: m.gahegan@auckland.ac.nz