AI Models Under Scrutiny: European Requirements Pose Challenges

  • LatticeFlow develops a tool for evaluating and optimizing AI models in accordance with EU regulations.
  • Leading AI models struggle with European regulations regarding cybersecurity and discrimination.

Eulerpool News

A new report indicates that some leading artificial intelligence (AI) models are struggling to meet stringent European requirements, particularly on cybersecurity and non-discriminatory output. The gaps call for action, as the EU is preparing comprehensive rules for generative AI that will significantly affect general-purpose AI.

The Swiss startup LatticeFlow, in collaboration with renowned researchers and with the involvement of European officials, has developed an assessment tool that tests AI models from tech giants such as Meta and OpenAI across various categories. The result: several models achieved an average rating of 0.75 or higher, but LatticeFlow's "Large Language Model (LLM) Checker" also identified weaknesses in key areas. Non-compliance may lead to heavy fines running into the millions or amounting to a share of annual turnover.

The review showed that discriminatory output remains a persistent problem in the development of generative AI models. OpenAI's "GPT-3.5 Turbo" scored a rather disappointing 0.46 in this area, while Alibaba's "Qwen1.5 72B Chat" fared even worse in the same category. The tool also tested for security gaps such as "prompt hijacking", where Meta's "Llama 2 13B Chat" and Mistral's "8x7B Instruct" likewise received poor ratings. Anthropic's "Claude 3 Opus", however, emerged as the best model, with an impressive rating of 0.89.

Petar Tsankov, CEO and co-founder of LatticeFlow, pointed out that the overall positive test results give providers a clear path to optimizing their models for legal compliance. This matters because the EU guidelines are not yet fully established. Developers can use the LLM Checker online, free of charge, to test their models' compliance. Although the European Commission does not validate external tools, it welcomes the test as a first step in translating the new laws into concrete technical requirements.