Use and outputs
Keeping humans in the loop, and understanding how to use artificial intelligence systems effectively and securely, is key to responsible AI use and deployment.
Artificial intelligence (AI) tools can be helpful, but they don’t always get things right. That’s why it’s important to think carefully before using AI results to make big decisions – especially ones that affect people’s lives. AI should support the work people do, not replace their judgement. It’s best when humans and AI work together, with people staying responsible for the final choices.
GenAI user inputs
* GenAI tools work by responding to prompts – the questions or instructions you give them. The information included in any prompt (for example the images you provide, or text you type into an LLM) can affect what the tool gives back, and sometimes even how it learns and responds over time.
- * Some GenAI tools, particularly those that are publicly available or free of charge, may share prompt data (including attachments) with external parties (including developers or other users). This means information provided through prompts could be exposed in future through a security breach or the model itself.
- * Even if your prompts or attachments don’t include names, or are otherwise anonymised or depersonalised, personal, sensitive or classified information in them could be joined up with other information over time to re-identify someone.
Tips
Advice for prompting
- Be specific and prescriptive in queries, including relevant context and the desired tone or style.
- If using a large language model (LLM), consider asking for step-by-step instructions to help you understand the response and pinpoint if and where errors are being made.
- Provide examples of what you are looking for to guide the type and structure of response.
- Instruct the model to say clearly if and when it is unsure of a response.
- Ask for information sources and references so you can verify output accuracy.
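For example – this is an illustrative prompt only, not a template – a query that combines these tips might read: ‘Summarise the main obligations small retailers have under the Consumer Guarantees Act, in plain English and a neutral tone, as a short bulleted list. Explain your reasoning step by step, tell me clearly if you are unsure about any point, and list the sources you relied on so I can check them.’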
Use of GenAI outputs
* GenAI can be used in a variety of ways, including to improve customer experience, for marketing and content creation, and for productivity gains. However, there are some potential risks when it comes to GenAI outputs and their use:
- * LLMs are designed to generate “statistically probable language patterns”, which means different (though often similar) answers are likely to be given to the same prompt. Businesses should therefore not rely on any specific answer being produced in response to a specific prompt.
- * LLMs may be susceptible to errors, omissions, training bias and factual inaccuracy. While responses can seem well developed and credible, they may confidently present opinion as fact, skip important details depending on the prompt given, and in some instances present fabricated information as truth. This tendency is referred to as ‘hallucination’.
- * Depending on the training data and prompt used, outputs may also reflect prevalent, often biased, out-of-date or unethical viewpoints. These tendencies could disproportionately impact Māori and other Indigenous, minority, or otherwise disadvantaged communities (including women, older people, young people and children, and disabled people).
- * Some GenAI tools can generate realistic images or videos of a person, or replicate their voice, and can be used maliciously to create ‘deepfakes’ – realistic but fabricated audio, video, or images that convincingly mimic real individuals. Malicious actors can use them to spread misinformation, defame individuals, or carry out scams and exploitation. The technology has also been misused to create explicit content.
- * GenAI can be, and often is, used to generate new and novel content and material. However, outputs may lack commercial protection or ownership (including given the difficulty of assessing the level of ‘human authorship’ and establishing originality). There is also always a risk that model outputs may (intentionally or inadvertently) be substantially similar to existing copyright works (whether or not those works were used to train the model) and therefore be subject to plagiarism and copyright infringement claims.
- * GenAI has been used to replicate components of Māori culture and traditional knowledge without appropriate permission and attribution. New Zealand’s intellectual property laws include provisions for the protection of mātauranga Māori, in relation to patents and trade marks. These provisions help prevent the registration of trade marks or granting of patents that would be considered offensive by Māori or contrary to Māori values. Te Puni Kōkiri (the Ministry of Māori Development) is working to address regulatory barriers to commercialisation of mātauranga Māori (traditional knowledge).
Scenario C: ChoiceConsulting
ChoiceConsulting is a small independent policy consultancy in New Zealand that has started using generative AI to increase efficiency in producing background briefs, literature scans, and contextual overviews for clients in the public and private sectors. The team finds that the AI tool significantly speeds up their workflow, allowing them to generate first drafts of content in minutes.
While preparing a report for a government client, one of the consultants uses GenAI to produce a summary of recent international research. The AI-generated draft includes several citations and a compelling statistic attributed to an internationally renowned organisation. The consultant includes the information in the client deliverable with only a light review, trusting the AI’s output.
However, the client attempts to follow up on the statistic and finds that the cited report does not exist. Upon further investigation, it turns out the statistic and reference were entirely fabricated by the AI, a known phenomenon called a hallucination. The client raises concerns about the credibility of the work and requests a full review of all sources and claims in the report. This unexpected rework costs ChoiceConsulting additional time and damages their professional reputation.
In response, ChoiceConsulting revises their internal processes. All AI-generated outputs are now reviewed by a human with subject-matter knowledge, and every citation must be verified against original, reputable sources before inclusion. They introduce a step in their workflow to replace or remove any unverifiable content, and incorporate a disclaimer in all draft materials that clearly explains how GenAI was used and what human verification was undertaken.
Human-in-the-loop decision-making
AI systems, and particularly those using advanced machine learning methods such as deep learning, are often treated as ‘black boxes’ due to their complex inner workings.
Depending on the way an AI system is being used, some level of real-time human intervention may be needed. In most situations, the level of human review will depend on the proven level of reliability of the system to deliver desired outputs, combined with a business’s assessment of their specific tolerance for the impact of inaccuracy.
If an AI system is doing something low-risk – like suggesting a movie or a product – it might not need a person to check its work. This is often referred to as ‘human-out-of-the-loop’.
If an AI system is helping to make big decisions – like those about money, health, or the law – a person should always be involved. This is called ‘human-in-the-loop’. The level of human involvement will likely depend on the risk associated with those decisions, or the cost of harm of a ‘wrong’ decision. This can range from having a human consistently make the final decision (informed by an AI output) to having human oversight only over particularly high-risk or low-confidence AI results.
For example, an AI system may be used to analyse transaction patterns to detect fraud, but given the potentially high cost of incorrect decisions (which could lead to financial losses, regulatory penalties, or reputational damage), high-risk transactions are flagged for human review. This can be considered ‘augmented intelligence’ – the AI assists with scanning transactions for suspicious patterns, but the fraud analyst will make the final decision.
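As a simple illustration only – the function names, scoring logic and threshold below are hypothetical assumptions, not part of any specific product – the routing rule in this kind of arrangement might look like the following sketch in Python:

```python
# Illustrative sketch of a human-in-the-loop routing rule for AI-assisted
# fraud detection. All names and thresholds are hypothetical assumptions.

from dataclasses import dataclass


@dataclass
class Transaction:
    transaction_id: str
    amount: float


def ai_fraud_score(transaction: Transaction) -> float:
    """Placeholder for an AI model returning a fraud risk score in [0, 1].

    In practice this would call the business's own fraud-detection model.
    """
    return 0.2 if transaction.amount < 1000 else 0.8


def route_transaction(transaction: Transaction, review_threshold: float = 0.5) -> str:
    """Process low-risk transactions automatically; flag the rest for a human.

    Anything at or above the threshold goes to a fraud analyst, who makes
    the final decision (human-in-the-loop).
    """
    score = ai_fraud_score(transaction)
    if score >= review_threshold:
        return "flag_for_human_review"    # analyst makes the final call
    return "process_automatically"        # low risk, no human needed


if __name__ == "__main__":
    print(route_transaction(Transaction("TX-001", amount=250.00)))
    print(route_transaction(Transaction("TX-002", amount=5400.00)))
```

The key point of the threshold is that the AI only filters and prioritises; a human analyst remains the decision-maker for anything the system flags as risky or uncertain.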
Humans can over-rely on automated systems or algorithms and accept outputs as correct without critical evaluation. This is known as ‘automation bias’, and it can undermine the purpose of having human oversight because individuals are not effectively acting as an independent check.
Overall, regardless of the supporting technology used, businesses are responsible for their decisions, so robust checking and due diligence, including adequate recordkeeping and documentation, are crucial.
Scenario D: DermaDupe
DermaDupe is a rapidly growing skincare e-commerce brand. Within its first year, it had attracted a small but loyal customer base, and built a sleek online presence using a third-party web storefront platform (ShoppingCommerce). To scale quickly, stand out in a competitive market, and boost conversions from website views to purchases, DermaDupe integrated AI tools from the ShoppingCommerce App Store into their website.
These included a chatbot for customer service, a conversion optimiser, a customer review generator, and a product description generator.
Not long after implementing these AI tools, the company received phone calls from customers complaining of a number of issues, including that:
- the website had informed them that only one or two units of their favourite product were in stock, but then allowed them to purchase more when, worried they would run out, they attempted to do so
- some product descriptions were inaccurate – for example, claiming all natural ingredients when that was not the case – in contrast with the positive reviews on the same page
- when attempting to make their complaints via the online chatbot, they received incorrect information about their option to receive a refund and there was no option to speak to a human.
DermaDupe discovered that the AI tools they had implemented:
- were creating false scarcity claims, unrelated to actual stock levels, in order to trigger customers’ psychological urgency to purchase the product
- made false and misleading claims about products in an attempt to generate product descriptions that would appeal to DermaDupe’s customer base
- had automatically posted reviews under fake customer profiles
- did not take into account New Zealand legislation such as the Consumer Guarantees Act 1993.
On learning this, DermaDupe disabled the ShoppingCommerce auto-publishing feature for product descriptions and reviews. They replaced the false reviews with verified customer feedback, and ensured human review for all AI-generated descriptions prior to publication. They retrained the customer chatbot’s understanding of customers’ options for return, and ensured it provided options to speak to a human in the event it could not answer a customer query.
Luckily, DermaDupe were able to identify and rectify these issues early thanks to their customer feedback mechanisms and responsiveness. Otherwise, they risked breaching the Fair Trading Act 1986 by misleading consumers – which could have attracted substantial financial penalties.
* Indicates content specific to GenAI