Hypothesis

The Orca2 LLM can be an effective Named Entity Recognition and Relation Extraction tool for aircraft system descriptions.

Setup

For this experiment, I am using the Orca 2 7B model in GGUF format with Q5_K_M quantization. It tends to run quickly using the accelerate package and is recommended by the author for better fidelity and performance.

I decided to use LangChain to simplify working with the model so that I can focus on the actual experiment.

Location

GitHub location: ai4safety/MikeWorkspace/Goal1/TestNotebook.ipynb

Model Config

'context_length':       8000, 
'max_new_tokens':       4096,
'gpu_layers':           32,
'temperature':          0.3, 
'top_p':                0.45,
'top_k':                18,
'repetition_penalty':   1.1, 

  • A low temperature is preferred so that the model does not invent data.

  • top_p, top_k and repetition_penalty were set based on randomized trials over many runs.

  • context_length is set to an arbitrarily large number. This can affect performance, but it has not caused problems so far.

  • max_new_tokens is set to the value recommended for Llama 2.

  • gpu_layers is set to what my RTX 3070 Ti Mobile can handle without locking up.
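
The relationship between context_length and max_new_tokens can be checked with a little arithmetic. The sketch below is illustrative only: MODEL_CONFIG restates the values listed above, and prompt_token_budget is a hypothetical helper that is not part of the notebook. It assumes that the prompt and the generated tokens share a single context window.

```python
# Values copied from the Model Config section above.
MODEL_CONFIG = {
    "context_length": 8000,
    "max_new_tokens": 4096,
    "gpu_layers": 32,
    "temperature": 0.3,
    "top_p": 0.45,
    "top_k": 18,
    "repetition_penalty": 1.1,
}


def prompt_token_budget(config: dict) -> int:
    """Tokens left for the prompt once the generation budget is reserved.

    Hypothetical helper: assumes prompt and output share one context window.
    """
    budget = config["context_length"] - config["max_new_tokens"]
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_length")
    return budget


print(prompt_token_budget(MODEL_CONFIG))  # 8000 - 4096 = 3904
```

Under that assumption, a system description plus the prompt scaffolding should stay under roughly 3904 tokens if the full generation budget is to remain available.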

System Prompt

The system prompt is created using LangChain prompt templates with two input variables: {input}, the system description to be analyzed, and {json}, the example JSON output.

The system prompt tested for NER is:

        You are an aircraft systems expert.
        Perform the task of Named Entity Recognition (NER) on the input text.
        You must format your output as a JSON value that adheres to the example below.
        Your output will be parsed and type-checked!
        
        Example text:
        The engine connects to the fuel pump. 
        The Avionics depends on the Electrical System.
        
        Results in the json:
        {json}
        
        ### Input:
        {input}
        
        ### Response:
        Valid JSON that adheres to the requirements above and the format of the example json output.

The system prompt used for RE is:

        You are an aircraft systems expert.
        Perform the task of Relation Extraction (RE) on the input text.
        You must format your output as a JSON value that adheres to the example below.
        Your output will be parsed and type-checked!
        
        Example text:
        The engine connects to the fuel pump. 
        The Avionics depends on the Electrical System.
        
        Results in the json:
        {json}
        
        ### Input:
        {input}
        
        ### Response:
        Valid JSON that adheres to the requirements above and the format of the example json output.
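
How the two placeholders get filled can be sketched without any dependencies. In the snippet below, plain str.format stands in for the LangChain prompt template used in the notebook, TEMPLATE is a shortened stand-in for the prompts above, and render_prompt and the one-entity example_json are hypothetical names introduced only for illustration.

```python
import json

# Shortened stand-in for the prompts above; only the two placeholders matter here.
TEMPLATE = (
    "Perform the task of Named Entity Recognition (NER) on the input text.\n"
    "Results in the json:\n"
    "{json}\n"
    "### Input:\n"
    "{input}\n"
    "### Response:\n"
)

# Hypothetical one-entity example, not the full example from the notebook.
example_json = {"entities": [{"name": "Engine", "type": "SystemComponent"}]}


def render_prompt(template: str, system_description: str, example: object) -> str:
    """Fill {input} and {json}; the example is serialized so the model sees valid JSON."""
    return template.format(
        input=system_description,
        json=json.dumps(example, indent=2),
    )


prompt = render_prompt(TEMPLATE, "The engine connects to the fuel pump.", example_json)
print(prompt)
```

Serializing the example with json.dumps rather than pasting a Python literal avoids showing the model trailing commas or single quotes, which would not be valid JSON.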

JSON Structure

# output json structures
example_NER_json = {
    "entities": [
        {
            "name": "Engine",
            "type": "SystemComponent",
        },
        {
            "name": "Fuel Pump",
            "type": "SystemComponent",
        },
        {
            "name": "Avionics",
            "type": "System",
        },
        {
            "name": "Electrical System",
            "type": "System",
        }
    ]
}

example_RE_json = [
    {
        "entity1": "Engine",
        "entity2": "Fuel Pump",
        "relation_type": "Connects To"
    },
    {
        "entity1": "Avionics",
        "entity2": "Electrical System",
        "relation_type": "Depends On"
    }
]
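
Since the prompts tell the model that its output will be parsed and type-checked, a check along the following lines could be applied to each response. This is a minimal sketch using only the standard library, assuming the two shapes shown above (an object with an "entities" list for NER, a bare list for RE). The function names parse_ner, parse_re and _require_string_fields are hypothetical and do not come from the notebook.

```python
import json


def _require_string_fields(item: object, fields: tuple) -> None:
    """Raise ValueError unless item is a dict with a string under every field."""
    if not isinstance(item, dict):
        raise ValueError(f"expected an object, got {type(item).__name__}")
    for field in fields:
        if not isinstance(item.get(field), str):
            raise ValueError(f"missing or non-string field: {field!r}")


def parse_ner(raw: str) -> list:
    """Parse NER output shaped like example_NER_json."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("entities"), list):
        raise ValueError('expected an object with an "entities" list')
    for item in data["entities"]:
        _require_string_fields(item, ("name", "type"))
    return data["entities"]


def parse_re(raw: str) -> list:
    """Parse RE output shaped like example_RE_json."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a list of relations")
    for item in data:
        _require_string_fields(item, ("entity1", "entity2", "relation_type"))
    return data


entities = parse_ner('{"entities": [{"name": "Engine", "type": "SystemComponent"}]}')
relations = parse_re(
    '[{"entity1": "Engine", "entity2": "Fuel Pump", "relation_type": "Connects To"}]'
)
print(len(entities), len(relations))
```

Both parsers raise ValueError on malformed output, so a caller can catch a single exception type and decide whether to retry the model or record the failure.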

Runs Using Data From Real System Descriptions

Source Data: https://openmbee.atlassian.net/l/cp/c2Qmroq2

#      Orca2 Output
P1
P2
P3
P4
P5
P6
P7
P8
P9
P10
