Building an Adaptive AI O&M Agent: Large Language Models in Practice for Software Log O&M (29-page PPT)

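Speaker
Yilun Liu (刘逸伦), Engineer, Text Machine Translation Lab, Huawei 2012 Laboratories
Received a bachelor's degree from Nankai University and a master's degree from the Georgia Institute of Technology in the United States. Research interests include AIOps, LLM quality evaluation, and LLM prompting strategies. Has published more than 10 papers as first or corresponding author at top international conferences and journals such as ICDE, ICSE, and IWQoS.

CONTENTS
1. Viewpoints on software log O&M
2. Gaps facing adaptive agents in the O&M domain
3. An LLM prompt engine for adaptive O&M agents
4. LLM knowledge transfer for building O&M-specialized models
5. Future outlook

PART 01 Viewpoints on software log O&M: the evolution trend of AIOps is from task-data-driven models to adaptive O&M agents

(1) Logs are machine language: large-scale networks and software systems generate PB-scale logs every day while running. These logs are natural-language-like text that describes, in real time, the running state of devices and any anomalies.
(2) Traditional network O&M is the manual translation of machine language: to keep the network stable, O&M engineers continuously monitor device state and try to detect anomalies and unexpected events accurately and promptly. Network logs are the most important data source for device operation and maintenance, and engineers typically find problems and analyze root causes by reading the natural language and semantic information in the logs.
(3) Automatic log analysis is the automatic translation of machine language: log text is highly varied and voluminous, and most logs are unstructured text, so it is impossible to monitor and inspect all logs manually. More importantly, analyzing device logs requires rich domain knowledge and is time-consuming and labor-intensive, and simple rule configuration cannot understand the semantics of the text.

Figure labels: semi-structured text; natural-language-like.

Viewpoint 1: software log O&M is the conversion from machine language to natural language.

Table: log messages from some network infrastructure; the details in the logs bear some similarity to natural language.

Figure: automatic log O&M takes machine language from the O&M objects (system events, anomaly alerts) and converts it into natural language (status reports, analysis reports, root-cause trees, decisions), each step driven by an Action.

Table: generational evolution of the LogAIBox research project
Generation	Input	Method	Goal	Research outputs	Category
1st	Discrete features and KPIs	Feature identification and statistical algorithms	Fitting anomaly results	Ft-tree, LogParse	Task-data-driven
2nd	Tokens generated from log text	Deep learning	Fitting anomaly results	LogAnomaly, LogStamp
3rd	Paragraph-level logs and cross-domain logs	Pre-trained language models	Log language understanding	BigLog, DA-Parser
4th	Raw logs and natural-language text	Large language models	Interpretable O&M	LogPrompt	Instruction-driven
5th	Adaptive O&M agents: goal-adaptive, domain-adaptive, highly interactive, executable ...

[1] LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs. (IJCAI 2019)
[2] LogParse: Making Log Parsing Adaptive through Word Classification. (ICCCN 2020)
[3] LogStamp: Automatic Online Log Parsing Based on Sequence Labelling. (WAIN Performance 2021)
[4] BigLog: Unsupervised Large-scale Pre-training for a Unified Log Representation. (IWQoS 2023)
[5] DA-Parser: A Pre-trained Domain-Aware Parsing Framework for Heterogeneous Log Analysis. (COMPSAC 2023)
[6] LogPrompt: Prompt Engineering Towards Zero-Shot and Interpretable Log Analysis.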
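(ICSE 2024 & ICPC 2024)
Team repo: https://github.com/LogAIBox

Viewpoint 2: the evolution trend of AIOps is from task-data-driven models to adaptive O&M agents.

PART 02 Gaps facing adaptive agents in the O&M domain: traditional automated O&M models cannot be "adaptive" and have only limited "intelligence"

Gap 1: traditional AIOps algorithms depend on task-labeled data, merely fit that data, and cannot adapt to new domains
In online scenarios, frequent software updates, third-party plug-ins, and similar changes mean that most of the logs produced have never been seen by the model, and sufficient historical labeled data is hard to obtain, so the model needs to be adaptive. When task training data shrinks, traditional methods generally lose prediction accuracy. Applying them to a private system therefore inevitably requires a large amount of labeled data.
Figure labels: Performance Upgrading, Fixing bugs, New features.

Gap 2: traditional O&M systems have poor interpretability and weak interactivity, so their intelligence is limited
(1) Traditional log analysis algorithms only output "alert / normal" and give no feedback on abnormal logs. Experts must read the related log templates and compile analysis reports by hand, which is time-consuming and labor-intensive.
(2) They only give a prediction result, so false alarms and missed alarms cannot be ruled out quickly and must be investigated against the raw logs.
Existing methods can automatically map fault symptoms from task data, but they still do not complete the last step of AIOps: root cause analysis and fault self-recovery. The interaction design of these systems lacks feedback and interactivity, leaving them far from an "agent".

Vision for the O&M agent: instruction-driven rather than data-driven, able to find root causes and correct itself, and serving as a bridge for communication between device systems and engineers.
Figure captions: based on the results of the current round of analysis, the LLM automatically generates an analysis report and recommends solutions; explanations are generated for abnormal logs, so false alarms and missed alarms can be judged quickly.

PART 03 An LLM prompt engine for adaptive O&M agents: LogPrompt uses prompt engineering to unlock the O&M potential of LLMs, with zero-shot inference plus interpretability

LogPrompt addresses the two major gaps of traditional log analysis

Traditional methods
• Depend on task data; expert labeling is time-consuming and labor-intensive; poor adaptivity.
• Limited intelligence and poor interpretability; they output an alert conclusion directly and cannot analyze the alert event.

LogPrompt
• No training resources needed; can be flexibly transferred to different devices and applications
  • Relies on the general knowledge acquired by the LLM during pre-training; no separate domain fine-tuning is performed.
  • Injects domain-expert alignment information through prompt strategies, enabling fast and flexible transfer.
• Enhanced interpretability and interactivity of analysis results
  • A chain-of-thought prompt engine elicits the LLM's domain text analysis and root-cause reasoning abilities. It sorts out a chain-of-thought logic from the cluttered information in alert logs, and the AI model generates an event analysis summary end to end, quickly identifying missed alarms and false alarms and finding the root cause.
  • Based on the user's description of their needs, it flexibly provides alert query, localization, and analysis services through multi-turn dialogue.

Potential and challenges of LLMs as O&M agents: LLMs have strong language generalization and explanation abilities, but they are sensitive to prompts

Unlike existing deep learning models, LLMs (such as ChatGPT) have strong language generation ability and can handle complex writing tasks (Gap 2) such as emails and reports. Log interpretation can be seen as a domain writing task.

Large language models (LLMs) generalize well to unseen user instructions (Gap 1), and may also be able to handle unseen logs in the online setting of log analysis.

The primary objective of LogPrompt is to enhance the correctness and interpretability of log analysis in the online scenario by exploring a proper strategy for prompting LLMs.

Since log analysis is a domain-specific, non-general NLP task, directly applying a simple prompt to LLMs can result in poor performance. In our preliminary experiments, ChatGPT with the simple prompt achieved an F1-score of only 0.189 in anomaly detection.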
However, our best prompt strategy outperformed the simple prompt by 0.195 in F1-score.

Introducing the chain-of-thought (CoT) prompt strategy can unlock the LLM's ability to tackle the challenges of log analysis

There are many prompt philosophies proposed for NLP tasks, such as CoT and ToT.

The concept of chain of thought (CoT), a series of intermediate reasoning steps, was introduced by Wei et al. [1]. The CoT prompt emulates the human thought process by requiring the model to include thinking steps when addressing complex problems, and it can enhance the performance of LLMs on challenging tasks such as solving mathematical problems.

Advantages of the CoT prompt
• Breaks down unseen problems into manageable steps (Gap 1)
• Enhances the interpretability and transparency of LLM output (Gap 2)
• Unleashes the abilities learned in the pre-training phase

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.

The CoT prompt in the original CoT paper places an example with intermediate steps before the input math problem, so that the model is encouraged to follow the thinking style of the example.

LogPrompt exploration: bringing the idea of the CoT prompt into log analysis tasks

In manual log analysis, practitioners also go through a series of reasoning steps to reach a conclusion. For instance, without further definitions, the boundary between a normal log and an abnormal log is unclear. To emulate the thinking process of O&M engineers, we propose two variants of the CoT prompt for log analysis:
• Implicit CoT: Humans usually have reasons before reaching a conclusion. Thus, in the prompt, we require the LLM to generate a reason for each normal/abnormal answer, justifying its decisions.
• Explicit CoT: We further explicitly define intermediate steps to regulate the thinking process.
For example, in the task of anomaly detection, we constrain the definition of anomaly to be only "alerts explicitly expressed in textual content" and define four steps for anomaly detection.

Figure: Standard Prompt (Task description + Input logs, performing analysis based on Task description and Input logs) versus LogPrompt (CoT) (Task description + CoT Component + Input logs, performing analysis based on Task description, Input logs and CoT Component). The CoT Component is either Implicit ("concisely explain your reason for each log") or Explicit.

LogPrompt exploration: other prompt strategies that can be applied to log analysis tasks

Self-prompt: This strategy has the LLM suggest its own prompts. A meta-prompt describing the task asks the LLM to generate candidate prompt prefixes. These candidates are then tested on a specific log dataset (in our case, the first 100 logs from Android), and the most effective prompt is chosen based on performance.

Format Control: We employ two functions, f_x([X]) and f_z([Z]), to establish the context for the input slot [X] and the answer slot [Z] in the prompt. S is a text string describing the desired answer value range, such as "a binary choice between abnormal and normal" or "a parsed log template".

In-context Prompt: This approach uses several samples of labeled logs to set the context for the task. The LLM then makes predictions on new logs, using the context from the sample logs.

• In our primary experiments, the underlying LLM is accessed via APIs provided by external services.
• The initial temperature coefficient is set to 0.5, striking a balance between strengthening the model's reasoning through diverse token exploration and limiting detrimental randomness.
• If the response format is invalid, the query is resubmitted with the temperature coefficient increased by 0.4, until the response format is correct. The format failure rate is less than 1%, which is consistent with existing literature.
• The train