Use and outputs
Keeping humans in the loop, and understanding how to use artificial intelligence systems effectively and securely, is key to responsible AI use and deployment.
Artificial intelligence (AI) tools can be helpful, but they don’t always get things right. That’s why it’s important to think carefully before using AI results to make big decisions – especially ones that affect people’s lives. AI should support the work people do, not replace their judgement. It’s best when humans and AI work together, with people staying responsible for the final choices.
GenAI user inputs
* GenAI tools work by responding to prompts – the questions or instructions you give them. The information included in any prompt (for example the images you provide, or text you type into an LLM) can affect what the tool gives back, and sometimes even how it learns and responds over time.
- * Some GenAI tools, particularly those that are publicly available or free of charge, may share prompt data (including attachments) with external parties (including developers or other users). This means information provided through prompts could be exposed in future through a security breach or the model itself.
- * Even if your prompts or attachments don’t include names, or are otherwise anonymised or depersonalised, personal, sensitive or classified information in them could be joined up with other information over time to re-identify someone.
Tips
Advice for prompting
- Be specific and prescriptive in queries, including relevant context and the desired tone or style.
- If using a large language model (LLM), consider asking for step-by-step instructions to help you understand the response and pinpoint if and where errors are being made.
- Provide examples of what you are looking for to guide the type and structure of response.
- Instruct the model to say clearly if and when it is unsure of a response.
- Ask for information sources and references so you can verify output accuracy.
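For example – this is an illustrative prompt only, not a template – a query that combines these tips might read: ‘Summarise the main obligations small retailers have under the Consumer Guarantees Act, in plain English and a neutral tone, as a short bulleted list. Explain your reasoning step by step, tell me clearly if you are unsure about any point, and list the sources you relied on so I can check them.’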
Use of GenAI outputs
* GenAI can be used in a variety of ways, including to improve customer experience, for marketing and content creation, and for productivity gains. However, there are some potential risks when it comes to GenAI outputs and their use:
- * LLMs are designed to generate “statistically probable language patterns”, which means different (though often similar) answers are likely to be given to the same prompt. Businesses should therefore not rely on any specific answer being produced in response to a specific prompt.
- * LLMs may be susceptible to errors, omissions, training bias and factual inaccuracy. While responses can seem well developed and credible, they may confidently present opinion as fact, skip important details depending on the prompt given, and in some instances present fabricated information as truth. This tendency is referred to as ‘hallucination’.
- * Depending on the training data and prompt used, outputs may also reflect prevalent, often biased, out-of-date or unethical viewpoints. These tendencies could disproportionately impact Māori and other Indigenous, minority, or otherwise disadvantaged communities (including women, older people, young people and children, and disabled people).
- * Some GenAI tools can generate realistic images or videos of a person, or replicate their voice, and can be used maliciously to create ‘deepfakes’ – realistic but fabricated audio, video, or images that convincingly mimic real individuals. Malicious actors can use them to spread misinformation, defame individuals, or carry out scams and exploitation. The technology has also been misused to create explicit content.
- * GenAI can be, and often is, used to generate new and novel content and material. However, outputs may lack commercial protection or ownership (including given the difficulty of assessing the level of ‘human authorship’ and establishing originality). There is also always a risk that model outputs may (intentionally or inadvertently) be substantially similar to existing copyright works (whether or not those works were used to train the model) and therefore be subject to plagiarism and copyright infringement claims.
- * GenAI has been used to replicate components of Māori culture and traditional knowledge without appropriate permission and attribution. New Zealand’s intellectual property laws include provisions for the protection of mātauranga Māori, in relation to patents and trade marks. These provisions help prevent the registration of trade marks or granting of patents that would be considered offensive by Māori or contrary to Māori values. Te Puni Kōkiri (the Ministry of Māori Development) is working to address regulatory barriers to commercialisation of mātauranga Māori (traditional knowledge).
Scenario C: ChoiceConsulting
ChoiceConsulting is a small independent policy consultancy in New Zealand that has started using generative AI to increase efficiency in producing background briefs, literature scans, and contextual overviews for clients in the public and private sectors. The team finds that the AI tool significantly speeds up their workflow, allowing them to generate first drafts of content in minutes.
While preparing a report for a government client, one of the consultants uses GenAI to produce a summary of recent international research. The AI-generated draft includes several citations and a compelling statistic attributed to an internationally renowned organisation. The consultant includes the information in the client deliverable with only a light review, trusting the AI’s output.
However, the client attempts to follow up on the statistic and finds that the cited report does not exist. Upon further investigation, it turns out the statistic and reference were entirely fabricated by the AI, a known phenomenon called a hallucination. The client raises concerns about the credibility of the work and requests a full review of all sources and claims in the report. This unexpected rework costs ChoiceConsulting additional time and damages their professional reputation.
In response, ChoiceConsulting revises their internal processes. All AI-generated outputs are now reviewed by a human with subject-matter knowledge, and every citation must be verified against original, reputable sources before inclusion. They introduce a step in their workflow to replace or remove any unverifiable content, and incorporate a disclaimer in all draft materials that clearly explains how GenAI was used and what human verification was undertaken.
Human-in-the-loop decision-making
AI systems, and particularly those using advanced machine learning methods such as deep learning, are often treated as ‘black boxes’ due to their complex inner workings.
Depending on the way an AI system is being used, some level of real-time human intervention may be needed. In most situations, the level of human review will depend on the proven level of reliability of the system to deliver desired outputs, combined with a business’s assessment of their specific tolerance for the impact of inaccuracy.
If an AI system is doing something low-risk – like suggesting a movie or a product – it might not need a person to check its work. This is often referred to as ‘human-out-of-the-loop’.
If an AI system is helping to make big decisions – like those about money, health, or the law – a person should always be involved. This is called ‘human-in-the-loop’. The level of human involvement will likely depend on the risk associated with those decisions, or the cost of harm of a ‘wrong’ decision. This can range from having a human consistently make the final decision (informed by an AI output) to having human oversight only over particularly high-risk or low-confidence AI results.
For example, an AI system may be used to analyse transaction patterns to detect fraud, but given the potentially high cost of incorrect decisions (which could lead to financial losses, regulatory penalties, or reputational damage), high-risk transactions are flagged for human review. This can be considered ‘augmented intelligence’ – the AI assists with scanning transactions for suspicious patterns, but the fraud analyst will make the final decision.
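As a simple illustration only – the function names, scoring logic and threshold below are hypothetical assumptions, not part of any specific product – the routing rule in this kind of arrangement might look like the following sketch in Python:

```python
# Illustrative sketch of a human-in-the-loop routing rule for AI-assisted
# fraud detection. All names and thresholds are hypothetical assumptions.

from dataclasses import dataclass


@dataclass
class Transaction:
    transaction_id: str
    amount: float


def ai_fraud_score(transaction: Transaction) -> float:
    """Placeholder for an AI model returning a fraud risk score in [0, 1].

    In practice this would call the business's own fraud-detection model.
    """
    return 0.2 if transaction.amount < 1000 else 0.8


def route_transaction(transaction: Transaction, review_threshold: float = 0.5) -> str:
    """Process low-risk transactions automatically; flag the rest for a human.

    Anything at or above the threshold goes to a fraud analyst, who makes
    the final decision (human-in-the-loop).
    """
    score = ai_fraud_score(transaction)
    if score >= review_threshold:
        return "flag_for_human_review"    # analyst makes the final call
    return "process_automatically"        # low risk, no human needed


if __name__ == "__main__":
    print(route_transaction(Transaction("TX-001", amount=250.00)))
    print(route_transaction(Transaction("TX-002", amount=5400.00)))
```

The key point of the threshold is that the AI only filters and prioritises; a human analyst remains the decision-maker for anything the system flags as risky or uncertain.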
Humans can over-rely on automated systems or algorithms and accept outputs as correct without critical evaluation. This is known as ‘automation bias’, and it can undermine the purpose of having human oversight because individuals are not effectively acting as an independent check.
Overall, regardless of the supporting technology used, businesses are responsible for their decisions, so robust checking and due diligence, including adequate recordkeeping and documentation, are crucial.
Scenario D: DermaDupe
DermaDupe is a rapidly growing skincare e-commerce brand. Within its first year, it had attracted a small but loyal customer base, and built a sleek online presence using a third-party web storefront platform (ShoppingCommerce). To scale quickly, stand out in a competitive market, and boost conversions from website views to purchases, DermaDupe integrated AI tools from the ShoppingCommerce App Store into their website.
These included a chatbot for customer service, a conversion optimiser, a customer review generator, and a product description generator.
Not long after implementing these AI tools, the company received phone calls from customers complaining of a number of issues, including that:
- the website had informed them that only one or two units of their favourite product were in stock, but then allowed them to purchase more when, worried they would run out, they attempted to do so
- some product descriptions were inaccurate – for example, claiming all natural ingredients when that was not the case – in contrast with the positive reviews on the same page
- when attempting to make their complaints via the online chatbot, they received incorrect information about their option to receive a refund and there was no option to speak to a human.
DermaDupe discovered that the AI tools they had implemented:
- were creating false scarcity claims, unrelated to actual stock levels, in order to trigger customers’ psychological urgency to purchase the product
- made false and misleading claims about products in an attempt to generate product descriptions that would appeal to DermaDupe’s customer base
- had automatically posted reviews under fake customer profiles
- did not take into account New Zealand legislation such as the Consumer Guarantees Act 1993.
On learning this, DermaDupe disabled the ShoppingCommerce auto-publishing feature for product descriptions and reviews. They replaced the false reviews with verified customer feedback, and ensured human review for all AI-generated descriptions prior to publication. They retrained the customer chatbot’s understanding of customers’ options for return, and ensured it provided options to speak to a human in the event it could not answer a customer query.
Luckily, DermaDupe were able to identify and rectify these issues early thanks to their customer feedback mechanisms and responsiveness. Otherwise, they risked breaching the Fair Trading Act 1986 by misleading consumers – which could have attracted substantial financial penalties.
* Indicates content specific to GenAI