<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>云中漫步</title>
  
  <subtitle>The sun is fierce, the waves are gentle</subtitle>
  <link href="https://yang-makabaka.github.io/atom.xml" rel="self"/>
  
  <link href="https://yang-makabaka.github.io/"/>
  <updated>2023-12-11T09:27:03.704Z</updated>
  <id>https://yang-makabaka.github.io/</id>
  
  <author>
    <name>Yang.f.z.</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Perception Installation</title>
    <link href="https://yang-makabaka.github.io/posts/d8562c92.html"/>
    <id>https://yang-makabaka.github.io/posts/d8562c92.html</id>
    <published>2023-11-29T04:12:03.000Z</published>
    <updated>2023-12-11T09:27:03.704Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Perception-安装"><a href="#Perception-安装" class="headerlink" title="Perception 安装"></a>Perception Installation</h1><p>Notes on the Perception module of the Baidu Apollo autonomous driving simulation platform, version 9.0.</p><p>Based on the official tutorial <a href="https://apollo.baidu.com/community/article/1186">https://apollo.baidu.com/community/article/1186</a>, with additions and changes drawn from hands-on use.</p><h1 id="1-Apollo-Perception环境配置"><a href="#1-Apollo-Perception环境配置" class="headerlink" title="1 Apollo Perception环境配置"></a>1 Apollo Perception Environment Setup</h1><h2 id="1-1-安装基础软件"><a href="#1-1-安装基础软件" class="headerlink" title="1.1 安装基础软件"></a>1.1 Install Base Software</h2><h3 id="1-1-1-安装Linux-Ubuntu"><a href="#1-1-1-安装Linux-Ubuntu" class="headerlink" title="1.1.1 安装Linux - Ubuntu"></a>1.1.1 Install Linux - Ubuntu</h3><p>After installing Ubuntu, update the installed packages:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get update</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get upgrade</span><br></pre></td></tr></table></figure><h3 id="1-1-2-安装-Docker-Engine"><a href="#1-1-2-安装-Docker-Engine" class="headerlink" title="1.1.2 安装 Docker Engine"></a>1.1.2 Install Docker Engine</h3><p>Apollo requires Docker 19.03 or later. You can install Docker Engine by following the official documentation, or download and run the install script below:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">wget http://apollo-pkg-beta.bj.bcebos.com/docker_install.sh</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">bash docker_install.sh</span><br></pre></td></tr></table></figure><p>Note: skip steps 1.1.1 and 1.1.2 if you have already completed them.</p><h3 id="1-1-3-安装驱动"><a href="#1-1-3-安装驱动" class="headerlink" title="1.1.3 安装驱动"></a>1.1.3 
Install Drivers</h3><p>GPU driver and CUDA version compatibility: NVIDIA hardware is updated quickly, so GPU drivers and CUDA versions are sometimes incompatible. The combinations below are the ones we have tested and confirmed to work.</p><div class="table-container"><table><thead><tr><th>GPU series</th><th>Tested GPU</th><th>Driver version</th><th>Minimum supported driver version</th><th>CUDA version</th></tr></thead><tbody><tr><td>GeForce 10 Series</td><td>GeForce GTX 1080</td><td>nvidia-driver-470.160.03</td><td>nvidia-driver-391.35</td><td>CUDA 11.4</td></tr><tr><td>GeForce RTX 20 Series</td><td>GeForce RTX 2070 SUPER</td><td>nvidia-driver-470.63.01</td><td>nvidia-driver-456.38</td><td>CUDA 11.4</td></tr><tr><td>GeForce RTX 30 Series</td><td>GeForce RTX 3090</td><td>nvidia-driver-515.86.01</td><td>nvidia-driver-460.89</td><td>CUDA 11.6</td></tr><tr><td>GeForce RTX 30 Series</td><td>GeForce RTX 3060</td><td>nvidia-driver-470.63.01</td><td>nvidia-driver-460.89</td><td>CUDA 11.4</td></tr><tr><td>Tesla V-Series</td><td>Tesla V100</td><td>nvidia-driver-418.67</td><td>nvidia-driver-410.129</td><td>CUDA 10.1</td></tr><tr><td>AMD</td><td>MI100 dGPU</td><td>ROCm™ 3.10 driver</td><td></td><td></td></tr></tbody></table></div><h3 id="1-1-3-1-安装显卡驱动"><a href="#1-1-3-1-安装显卡驱动" class="headerlink" title="1.1.3.1 安装显卡驱动"></a><strong>1.1.3.1 Install the GPU Driver</strong></h3><p><strong>Driver version 470.63.01 is recommended for 10, 20, and 30 series GPUs</strong>. Download link: <a href="https://www.nvidia.cn/Download/driverResults.aspx/179605/cn/">470.63.01 GPU driver</a></p><p>(In practice, a driver newer than the recommended version also works. If a driver was already installed along with the system, skip this step; running the installer again reports that a driver is already installed and refuses to continue):</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Check whether a GPU driver is installed. If the output matches the figure under “Driver check” below, the driver is installed; otherwise install it with the commands that follow</span></span><br><span class="line">nvidia-smi</span><br></pre></td></tr></table></figure><p>After downloading, open a terminal in the folder that contains the file and run:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo chmod <span class="number">777</span> NVIDIA-Linux-x86_64-<span 
class="number">470.63</span><span class="number">.01</span>.run &amp;&amp; sudo ./NVIDIA-Linux-x86_64-<span class="number">470.63</span><span class="number">.01</span>.run</span><br></pre></td></tr></table></figure><p><strong>Driver check</strong></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">nvidia-smi</span><br></pre></td></tr></table></figure><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/9ae7ced2c94a36b08552b43487a260a83589a428">https://apollo-studio-public.bj.bcebos.com/community/article/image/9ae7ced2c94a36b08552b43487a260a83589a428</a></p><p>Note: if you see the following instead, the GPU driver is not installed; see <a href="https://apollo.baidu.com/community/article/1181">https://apollo.baidu.com/community/article/1181</a></p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/93433bba9bac0293fd6df871083ad0f8157ef648">https://apollo-studio-public.bj.bcebos.com/community/article/image/93433bba9bac0293fd6df871083ad0f8157ef648</a></p><p>Note: <strong>this tutorial applies only to Ubuntu</strong>; the GPU driver cannot be installed in a virtual machine.</p><h3 id="1-1-3-2-安装nvida-docker"><a href="#1-1-3-2-安装nvida-docker" class="headerlink" title="1.1.3.2 安装nvida-docker"></a><strong>1.1.3.2 Install nvidia-docker</strong></h3><p>To get GPU support inside the container, install the NVIDIA Container Toolkit after installing Docker. Run the following commands to install it (in practice this can fail with the error “Unable to locate package nvidia-docker2”):</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">distribution=$(. 
/etc/os-release;echo $ID$VERSION_ID) &amp;&amp; curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - &amp;&amp; curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.<span class="built_in">list</span> | sudo tee /etc/apt/sources.<span class="built_in">list</span>.d/nvidia-docker.<span class="built_in">list</span> &amp;&amp; sudo apt-get -y update &amp;&amp; sudo apt-get install -y nvidia-docker2</span><br></pre></td></tr></table></figure><p>Note: if the method above fails, follow the <a href="https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html">official nvidia-docker guide</a> instead and run the two commands shown in the image below:</p><p><img src="https://s2.loli.net/2023/11/29/VFuhK6yjmSfNLIE.png" alt="Untitled.png"></p><h2 id="1-2-安装-Apollo-环境管理工具"><a href="#1-2-安装-Apollo-环境管理工具" class="headerlink" title="1.2 安装 Apollo 环境管理工具"></a>1.2 Install the Apollo Environment Manager</h2><h3 id="1-2-1-基础环境准备"><a href="#1-2-1-基础环境准备" class="headerlink" title="1.2.1 基础环境准备"></a>1.2.1 Prepare the Base Environment</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Add the repository signing key</span></span><br><span class="line">wget -O - https://apollo-pkg-beta.cdn.bcebos.com/neo/beta/key/deb.gpg.key | sudo apt-key add -</span><br><span class="line"></span><br><span class="line"><span class="comment"># Add the Apollo alpha package source</span></span><br><span class="line">sudo bash -c <span class="string">&quot;echo &#x27;deb https://apollo-pkg-beta.cdn.bcebos.com/apollo/core bionic main&#x27; &gt;&gt; /etc/apt/sources.list.d/apolloauto.list&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Refresh the package index</span></span><br><span class="line">sudo apt update</span><br></pre></td></tr></table></figure><h3 
id="1-2-2-安装-aem工具"><a href="#1-2-2-安装-aem工具" class="headerlink" title="1.2.2 安装 aem工具"></a>1.2.2 Install the aem Tool</h3><p>If you have not installed <code>aem</code> from Apollo 8.0 before, install it directly with the following command:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt install apollo-neo-env-manager-dev</span><br></pre></td></tr></table></figure><p>After installation, run the following command to verify it; output like the figure below means the installation succeeded:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">aem -h</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/GWdcICVU1NRSbDp.png" alt="Untitled 1.png"></p><h2 id="1-3-下载-Perception工程"><a href="#1-3-下载-Perception工程" class="headerlink" title="1.3 下载 Perception工程"></a>1.3 Download the Perception Project</h2><h3 id="1-3-1-下载工程代码"><a href="#1-3-1-下载工程代码" class="headerlink" title="1.3.1 下载工程代码"></a>1.3.1 Download the Project Code</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git clone https://github.com/ApolloAuto/application-perception</span><br></pre></td></tr></table></figure><p>Note: if GitHub is unreachable, clone from Gitee instead:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git clone https://gitee.com/ApolloAuto/application-perception</span><br></pre></td></tr></table></figure><h3 id="1-3-2-进入工程目录"><a href="#1-3-2-进入工程目录" class="headerlink" title="1.3.2 进入工程目录"></a>1.3.2 Enter the Project Directory</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cd application-perception</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Inspect the workspace file (read-only; nothing is modified). Seeing &quot;9.0.0-alpha2-r31&quot; as shown below means it is correct</span></span><br><span class="line">cat .workspace.json</span><br></pre></td></tr></table></figure><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/4ef69657ff72bed3ab517a35535554d762d1cf47">https://apollo-studio-public.bj.bcebos.com/community/article/image/4ef69657ff72bed3ab517a35535554d762d1cf47</a></p><p>Note: if it shows <strong>“9.0.0-alpha2-r29”:</strong></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Update with:</span></span><br><span class="line">git pull</span><br></pre></td></tr></table></figure><p>Alternatively, edit .workspace.json by hand and change <strong>9.0.0-alpha2-r29</strong> to <strong>9.0.0-alpha2-r31</strong>.</p><p>Then run cat .workspace.json again to confirm the change.</p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/4ef69657ff72bed3ab517a35535554d762d1cf47">https://apollo-studio-public.bj.bcebos.com/community/article/image/4ef69657ff72bed3ab517a35535554d762d1cf47</a></p><h2 id="1-4-调试perception工程"><a href="#1-4-调试perception工程" class="headerlink" title="1.4 调试perception工程"></a>1.4 Debug the Perception Project</h2><h3 id="1-4-1-进入Docker环境"><a href="#1-4-1-进入Docker环境" class="headerlink" title="1.4.1 进入Docker环境"></a>1.4.1 Enter the Docker Environment</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Pull and start the Docker container</span></span><br><span class="line">aem start</span><br><span class="line"></span><br><span class="line"><span class="comment"># Enter the container</span></span><br><span class="line">aem 
enter</span><br></pre></td></tr></table></figure><p>Note: after you run aem start, the terminal should look like the figure below.</p><p><img src="https://s2.loli.net/2023/11/29/7ADIlCFbjvigULM.png" alt="Untitled 2.png"></p><p>If the warning shown below still appears, <strong>the nvidia-docker installation in 1.1.3.2 failed</strong>.</p><p><img src="https://s2.loli.net/2023/11/29/uHCKAsp2gio1hyW.png" alt="Untitled 3.png"></p><p><strong>Check the buildtool version</strong></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">buildtool -v</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/RFUIQB5bwst398d.png" alt="Untitled 4.png"></p><p>Note: if your buildtool version does not match the figure above (a version starting with <strong>9.0.0-alpha</strong>), update it with:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt update &amp;&amp; sudo apt install --only-upgrade apollo-neo-buildtool</span><br></pre></td></tr></table></figure><p><strong>Upgrade the aem tool</strong></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt install apollo-neo-env-manager-dev</span><br></pre></td></tr></table></figure><p><strong>Install the dependencies</strong></p><p><code>This pulls and installs every dependency listed in cyberfile.xml under the core directory</code></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">buildtool build --gpu</span><br></pre></td></tr></table></figure><p>Note: this project contains only the perception functionality. To add PnC (planning) functionality, see the link below (optional).</p><p>See <strong>1.1.5 Upgrade the CCF-BDCI Second-Round Project</strong> in this article: <a href="https://apollo.baidu.com/community/article/1180">https://apollo.baidu.com/community/article/1180</a></p><p><strong>Also:</strong> the source code of installed modules such as planning is saved in the project's modules folder. If it does not appear after installation, install it manually by <strong>referring to application-perception/core/cyberfile.xml</strong>, as follows:</p><figure class="highlight python"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">buildtool install xxx</span><br></pre></td></tr></table></figure><p>Here xxx is the name of the module to install. For example, to install the planning source code, look it up in <strong>cyberfile.xml</strong>, where its <strong>repo_name</strong> is <strong>“planning”</strong>:</p><p><img src="https://s2.loli.net/2023/11/29/8nNal54LxKHYeyi.png" alt="Untitled 5.png"></p><p>The corresponding <strong>install command</strong> is:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">buildtool install planning</span><br></pre></td></tr></table></figure><p>After it runs, a <strong>planning</strong> source folder appears under <strong>application-perception/modules</strong>:</p><p><img src="https://s2.loli.net/2023/11/29/Oe1UJtHubPYA8Gk.png" alt="Untitled 6.png"></p><h3 id="1-4-2-设置车型参数"><a href="#1-4-2-设置车型参数" class="headerlink" title="1.4.2 设置车型参数"></a>1.4.2 Set the Vehicle Parameters</h3><p>This competition uses the ApolloScape dataset, so set the vehicle parameters to the apolloscape profile.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">aem profile use apolloscape</span><br></pre></td></tr></table></figure><h3 id="1-4-3-启动Dreamview"><a href="#1-4-3-启动Dreamview" class="headerlink" title="1.4.3 启动Dreamview+"></a>1.4.3 Start Dreamview+</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">aem bootstrap start --plus</span><br></pre></td></tr></table></figure><p>The --plus flag starts Dreamview+.</p><h3 id="1-4-4-下载安装感知模型"><a href="#1-4-4-下载安装感知模型" class="headerlink" title="1.4.4 下载安装感知模型"></a>1.4.4 Download and Install the Perception Model</h3><p>Install amodel, the model management tool:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wget https://apollo-pkg-beta.cdn.bcebos.com/perception/amodel-<span 
class="number">0.2</span><span class="number">.0</span>.tar.gz</span><br><span class="line">pip3 install --user amodel-<span class="number">0.2</span><span class="number">.0</span>.tar.gz</span><br></pre></td></tr></table></figure><p>Add the install location to PATH:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">export PATH=~/.local/<span class="built_in">bin</span>/:$PATH</span><br></pre></td></tr></table></figure><p>Install the perception model:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo ~/.local/<span class="built_in">bin</span>/amodel install center_point_paddle</span><br></pre></td></tr></table></figure><p>After installation, list the installed models:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">amodel <span class="built_in">list</span></span><br></pre></td></tr></table></figure><h3 id="1-4-5-启动lidar感知程序，播包调试-该步骤用于播放record，安装时不用执行"><a href="#1-4-5-启动lidar感知程序，播包调试-该步骤用于播放record，安装时不用执行" class="headerlink" title="1.4.5 启动lidar感知程序，播包调试(该步骤用于播放record，安装时不用执行)"></a>1.4.5 Start the Lidar Perception Program and Debug with Record Playback (this step plays back a record; skip it during installation)</h3><p>There are two ways to start lidar perception; choose one of them.</p><p><strong>1.4.5.1 Start from Dreamview+</strong></p><p>After starting Dreamview+ in <strong>1.4.3</strong>, click the <strong>Mode Settings</strong> button on the left and set Mode to <strong>Perception:</strong></p><p><img src="https://s2.loli.net/2023/11/29/rHp5z1Z2FIEnVGg.png" alt="Untitled 7.png"><br>    Start the <strong>Transform</strong> and <strong>Lidar</strong> perception modules:</p><p><img src="https://s2.loli.net/2023/11/29/Q5skO9KIS7NH2Z4.png" alt="Untitled 8.png"></p><p><strong>1.4.5.2 Start from the Command Line (starting from Dreamview+ is usually enough)</strong></p><p>Start transform:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cyber_launch start 
/apollo/modules/transform/launch/static_transform.launch</span><br></pre></td></tr></table></figure><p>Start lidar perception:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cyber_launch start /apollo/modules/perception/lidar_output/launch/lidar_output.launch</span><br></pre></td></tr></table></figure><p><strong>1.4.5.3 Debug Perception by Playing Back a Record</strong></p><p>Watch the perception results in <strong>Dreamview</strong>. For how to generate record files, see the <strong>Data Preparation</strong> section below:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># xxx.record is the name of the record file to play</span></span><br><span class="line">cyber_recorder play -f xxx.record</span><br></pre></td></tr></table></figure><h1 id="2-数据准备"><a href="#2-数据准备" class="headerlink" title="2 数据准备"></a>2 Data Preparation</h1><h2 id="2-1-数据下载"><a href="#2-1-数据下载" class="headerlink" title="2.1 数据下载"></a>2.1 Download the Data</h2><p>The training set, test set, and script code each include a readme.</p><p>You need to download the three files below. The first two are about 13 GB each, so make sure you have enough disk space.</p><p>Clicking a link in the table below starts a download directly in the browser, but it is very slow.</p><p>A faster option is to copy the link and create a download task with it in Xunlei (迅雷) on Windows, or ask me for a copy.</p><p><img src="https://s2.loli.net/2023/11/29/An1LSbZEjlzFxBt.png" alt="Untitled 9.png"></p><p><strong>Important: there is also an official tutorial for training on Baidu AI Studio. If you are short on disk space, skip the download and follow this link instead: <a href="https://apollo.baidu.com/community/article/1184">https://apollo.baidu.com/community/article/1184</a></strong></p><div class="table-container"><table><thead><tr><th>No.</th><th>Name</th><th>Link</th><th>Description</th></tr></thead><tbody><tr><td>1</td><td>Training set</td><td><a 
href="https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_train.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T09%3A49%3A45Z/2592000/host/a7870c32d5bd5a38ff679cf70250164b84a77c0556bd5ac8de371050a56cb02b">https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_train.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T09%3A49%3A45Z/2592000/host/a7870c32d5bd5a38ff679cf70250164b84a77c0556bd5ac8de371050a56cb02b</a></td><td>ApolloScape training dataset</td></tr><tr><td>2</td><td>Test set</td><td><a href="https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_test.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T09%3A49%3A03Z/2592000/host/c6c2bdcf29f531ed1bd4194700a179c5cf96e87063e99587454b19335ad9a10e">https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_test.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T09%3A49%3A03Z/2592000/host/c6c2bdcf29f531ed1bd4194700a179c5cf96e87063e99587454b19335ad9a10e</a></td><td>ApolloScape test dataset<br>(the dataset used for the score leaderboard)</td></tr><tr><td>3</td><td>Script code</td><td><a href="https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_scripts.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T11%3A32%3A35Z/2592000/host/e662b3c0219eeaa374ecbec0e024c9f71e1b7925de9d06a79c6506b035f466f6">https://apollo-records.bj.bcebos.com/perception/apolloscape/apolloscape_scripts.zip?authorization=bce-auth-v1/0824ae9513f643518e120667fc2a6d50/2023-11-13T11%3A32%3A35Z/2592000/host/e662b3c0219eeaa374ecbec0e024c9f71e1b7925de9d06a79c6506b035f466f6</a></td><td>Script that converts the ApolloScape dataset to the KITTI format<br>Script that converts ApolloScape data to records</td></tr></tbody></table></div><h2 id="2-2-adataset环境配置"><a href="#2-2-adataset环境配置" class="headerlink" title="2.2 adataset环境配置"></a>2.2 Set Up adataset</h2><p>adataset converts ApolloScape data into the Apollo record format, which makes end-to-end perception debugging easier.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Enter the container from the application-perception directory. Skip this if you are already inside the container.</span></span><br><span class="line">aem enter</span><br></pre></td></tr></table></figure><p>Install adataset:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Point pip at the Aliyun mirror</span></span><br><span class="line">pip config <span class="built_in">set</span> <span class="keyword">global</span>.index-url https://mirrors.aliyun.com/pypi/simple/</span><br><span class="line"></span><br><span class="line"><span class="comment"># Upgrade setuptools</span></span><br><span class="line">python -m pip install --upgrade setuptools</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install adataset</span></span><br><span class="line">pip install adataset</span><br></pre></td></tr></table></figure><h2 id="2-3-数据转化"><a href="#2-3-数据转化" class="headerlink" title="2.3 数据转化"></a>2.3 Convert the Data</h2><p>The three downloaded files are archives with the following contents:</p><p><img src="https://s2.loli.net/2023/11/29/YMoD8WcQNp73LaZ.png" alt="Untitled 10.png"></p><p><img src="https://s2.loli.net/2023/11/29/Y8W6TLNyb9f4MPk.png" alt="Untitled 11.png"></p><p><img src="https://s2.loli.net/2023/11/29/Rl3tiHAvk6z2ZcC.png" alt="Untitled 12.png"></p><p>Extract the contents of the three archives into a single folder, <strong>apolloDataSet</strong>:</p><p><img src="https://s2.loli.net/2023/11/29/reP9Nu54GsTyoCm.png" alt="Untitled 13.png"></p><p>The <strong>scripts</strong> folder contains apolloscape_to_records.py and apolloscape_to_kitti.py, the two programs used by the commands below.</p><p>Also create two empty folders, train_records and kitti, in advance to hold the converted data:</p><p><img src="https://s2.loli.net/2023/11/29/PuWlNEgSptJCvsX.png" 
alt="Untitled 14.png"></p><h3 id="2-3-1-使用apolloscape-to-records-py将apolloscape转化成apollo-records数据（需要在-apolloDataSet-文件夹打开终端）："><a href="#2-3-1-使用apolloscape-to-records-py将apolloscape转化成apollo-records数据（需要在-apolloDataSet-文件夹打开终端）：" class="headerlink" title="2.3.1 使用apolloscape_to_records.py将apolloscape转化成apollo records数据（需要在 apolloDataSet 文件夹打开终端）："></a>2.3.1 Convert ApolloScape Data to Apollo Records with apolloscape_to_records.py (open a terminal in the <strong>apolloDataSet</strong> folder):</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -d: the dataset; use a for ApolloScape; </span></span><br><span class="line"><span class="comment"># -i: the input dataset directory; use the training set or test set directory; </span></span><br><span class="line"><span class="comment"># -o: the output directory. It must be created in advance (as noted above); # -t: the output type; use rcd. </span></span><br><span class="line">python scripts/apolloscape_to_records.py -d=a -i=train/ -o=train_records/ -t=rcd</span><br></pre></td></tr></table></figure><p>Note: if you get ModuleNotFoundError: No module named 'yaml', install PyYAML with the command below and rerun the command above:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install pyyaml</span><br></pre></td></tr></table></figure><h3 id="2-3-2-Python2环境安装"><a href="#2-3-2-Python2环境安装" class="headerlink" title="2.3.2 Python2环境安装"></a>2.3.2 Set Up a Python 2 Environment</h3><p>(1) Install Miniconda (a lightweight version of Anaconda)</p><p>Official site: <a href="https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html">https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html</a></p><p>Or see my tutorial: <a href="https://yang-makabaka.github.io/posts/4120ac2f.html">https://yang-makabaka.github.io/posts/4120ac2f.html</a></p><p>(2) Create the environment with conda</p><ul><li>Create the python2 environment:</li></ul><figure class="highlight python"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda create -n python2 python=<span class="number">2.7</span></span><br></pre></td></tr></table></figure><ul><li>Switch to the python2 environment:</li></ul><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">conda activate python2</span><br><span class="line"><span class="comment"># If this fails, use: source activate python2</span></span><br></pre></td></tr></table></figure><ul><li><p>Install pypcd and numpy:</p>  <figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">conda install numpy</span><br><span class="line"></span><br><span class="line">pip install pypcd</span><br></pre></td></tr></table></figure></li><li><p>You can then convert the ApolloScape dataset to KITTI format with the command below.</p></li></ul><h3 id="2-3-3-使用apolloscape-to-kitti-py将apolloscape数据转化kitti格式，用于训练centerpoint模型（需要在-apolloDataSet-文件夹打开终端，且进入上面创建的环境python2）："><a href="#2-3-3-使用apolloscape-to-kitti-py将apolloscape数据转化kitti格式，用于训练centerpoint模型（需要在-apolloDataSet-文件夹打开终端，且进入上面创建的环境python2）：" class="headerlink" title="2.3.3 使用apolloscape_to_kitti.py将apolloscape数据转化kitti格式，用于训练centerpoint模型（需要在 apolloDataSet 文件夹打开终端，且进入上面创建的环境python2）："></a>2.3.3 Convert ApolloScape Data to KITTI Format with apolloscape_to_kitti.py for Training the CenterPoint Model (open a terminal in the <strong>apolloDataSet</strong> folder, with the python2 environment created above activated):</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Note: this step can run on the host; it does not need the container. The host needs pypcd, numpy, and a Python 2 environment.</span></span><br><span class="line"><span class="comment"># --pcd_path: 
path to the point cloud data; pcl_pcd is used here;</span></span><br><span class="line"><span class="comment"># --label_path: the annotation results; detection_label is used here;</span></span><br><span class="line"><span class="comment"># --output_path: where the generated data is stored, both point clouds and labels.</span></span><br><span class="line">python2 scripts/apolloscape_to_kitti.py --pcd_path=train/pcl_pcd/ --label_path=train/detection_label/ --output_path=./kitti</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Perception-安装&quot;&gt;&lt;a href=&quot;#Perception-安装&quot; class=&quot;headerlink&quot; title=&quot;Perception 安装&quot;&gt;&lt;/a&gt;Perception 安装&lt;/h1&gt;&lt;p&gt;百度Apollo自动驾驶仿真平台9.0版本Perce</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Introduction to Planning</title>
    <link href="https://yang-makabaka.github.io/posts/8f3d9018.html"/>
    <id>https://yang-makabaka.github.io/posts/8f3d9018.html</id>
    <published>2023-11-29T04:08:50.000Z</published>
    <updated>2023-11-29T04:10:26.380Z</updated>
    
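    <content type="html"><![CDATA[<h1 id="Planning-介绍"><a href="#Planning-介绍" class="headerlink" title="Planning 介绍"></a>Introduction to Planning</h1><p>Notes on the Planning module of the Baidu Apollo autonomous driving simulation platform, version 9.0.</p><h1 id="概述"><a href="#概述" class="headerlink" title="概述"></a>Overview</h1><h2 id="1-运行流程"><a href="#1-运行流程" class="headerlink" title="1 运行流程"></a>1 Workflow</h2><p>As the figure below shows, the Planning module is downstream of the Localization, Prediction, and Routing modules and upstream of the Control module. The Routing module first plans a navigation route, and the Planning module then optimizes locally along that route. If Planning finds that the short-term planned route is not feasible (for example, road works ahead or a missed intersection), it triggers Routing to replan, so data flows in both directions between these two modules.</p><p><img src="https://s2.loli.net/2023/11/29/x38NH6bMGyrZUcn.png" alt="Untitled.png"></p><h2 id="2-原理"><a href="#2-原理" class="headerlink" title="2 原理"></a>2 Principles</h2><p>The Apollo planning module is <strong>scenario-based</strong>: for each scenario, it plans the trajectory by combining a series of independent <strong>tasks</strong>. Developers can adjust the list of enabled scenarios, or the task types supported within a scenario, to suit their needs.</p><p><img src="https://s2.loli.net/2023/11/29/dnxJK8kGhZeLAR2.png" alt="Untitled 1.png"></p><p>The diagram above shows the Apollo planning architecture. Its key components are:</p><ul><li>Apollo FSM (Finite State Machine): a finite state machine that determines the autonomous vehicle's driving scenario from navigation, environment, and other information</li><li>Planning Dispatcher: calls the planner suited to the current scenario, based on the state machine and vehicle information</li><li>Planner: combines information from upstream modules and plans the vehicle's trajectory through a series of tasks</li><li>Deciders &amp; Optimizers: a set of tasks that implement decision and optimization. Optimizers optimize the vehicle's trajectory and speed. Deciders are rule-based and determine driving behavior such as when to change lanes, when to stop, when to creep (move forward slowly), and when creeping is complete.</li></ul><h2 id="3-功能列表"><a href="#3-功能列表" class="headerlink" title="3 功能列表"></a>3 <strong>Feature List</strong></h2><div class="table-container"><table><thead><tr><th>Feature</th><th>Description</th><th>Related code packages</th></tr></thead><tbody><tr><td>lane follow</td><td>The vehicle follows the route given in the command, looks up the route's lane information in the map, and plans a trajectory along the lane.</td><td>LaneFollowScenario<br>LaneFollowPath</td></tr><tr><td>nudge</td><td>If a stationary or slow obstacle ahead occupies part of the lane but enough space remains in the current lane, the vehicle can pass the obstacle within its lane.</td><td>LaneFollowScenario<br>LaneFollowPath</td></tr><tr><td>lane change</td><td>While following the route in RoutingResponse, the vehicle switches from one lane to an adjacent lane.</td><td>LaneFollowScenario<br>LaneChangePath</td></tr><tr><td>lane 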
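borrow</td><td>If an obstacle ahead stays put for a long time and blocks the road, and the vehicle cannot get around it within the current lane, the vehicle borrows the adjacent lane to pass the obstacle. Once past it, the vehicle returns to its original lane immediately.</td><td>LaneFollowScenario<br>LaneBorrowPath</td></tr><tr><td>pull over</td><td>When the vehicle approaches the destination, a configuration option controls whether it pulls over there. If pull-over at the destination is enabled, the vehicle looks for a spot near the destination where it can stop and parks there. If other obstacles sit in front of or behind that spot, the vehicle can use the OpenSpace parking algorithm to park in it.</td><td>PullOverScenario<br>PullOverPath</td></tr><tr><td>park and go</td><td>If the vehicle is parked off the road, when it starts again it first drives from its current position onto the lane, using the OpenSpace planning algorithm if necessary, and then continues along the road as usual.</td><td>ParkAndGoScenario</td></tr><tr><td>crosswalk</td><td>When the vehicle reaches a crosswalk and pedestrians are crossing, it stops and waits for them to pass before continuing.</td><td>Crosswalk</td></tr><tr><td>bare intersection</td><td>When the vehicle reaches an intersection with no traffic light or stop sign, oncoming vehicles have no explicit right-of-way indication, so the vehicle decides whether to proceed based on the traffic at the intersection.</td><td>BareIntersectionUnprotectedScenario</td></tr><tr><td>traffic light protected/unprotected</td><td>At a signalized intersection, if the traffic light has left-turn/right-turn arrows, the vehicle stops on red and proceeds on green. If the light has no arrow signals, oncoming vehicles may still be crossing while the vehicle passes through, so it slows down before the intersection and proceeds once there is no conflict.</td><td>TrafficLight<br>TrafficLightProtectedScenario<br>TrafficLightUnprotectedLeftTurnScenario<br>TrafficLightUnprotectedRightTurnScenario</td></tr><tr><td>stop sign</td><td>When there is a stop sign ahead, the vehicle stops first to observe, then proceeds once no pedestrians or vehicles are in conflict.</td><td>StopSign<br>StopSignUnprotectedScenario</td></tr><tr><td>yield sign</td><td>At an intersection without traffic lights but with a yield sign, the vehicle lets oncoming vehicles pass first and then proceeds.</td><td>YieldSign<br>YieldSignScenario</td></tr><tr><td>keep clear area</td><td>The vehicle must not stop inside a Keep Clear Area while passing through it.</td><td>KeepClear</td></tr><tr><td>rerouting</td><td>If the vehicle is blocked on the road for longer than a set time, planning sends a rerouting request to get unstuck.</td><td>Rerouting</td></tr><tr><td>valet parking</td><td>Given the id of a parking space on the map, the vehicle navigates from its current position to the space and parks in it.</td><td>ValetParkingScenario</td></tr><tr><td>emergency pull over</td><td>While driving, the vehicle can receive an external command to pull over immediately.</td><td>EmergencyPullOverScenario</td></tr></tbody></table></div><h2 id="4-最新9-0更新特性"><a href="#4-最新9-0更新特性" class="headerlink" title="4 最新9.0更新特性"></a>4 What's New in 9.0</h2><h3 id="（1）接口升级"><a href="#（1）接口升级" class="headerlink" 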
title="（1）接口升级"></a>（1）接口升级</h3><p>在新版本中对这些接口进行了优化和升级：</p><ul><li>统一梳理和封装，调用接口时，命令统一转发到”ExternalCommandProcessor”模块，通过封装，当PNC内部模块接口升级时，可以保持外部命令接口不变。</li><li>改用cyber中service-client机制调用，用户可以通过client查询当前任务的执行状态。</li><li>对RoutingRequest的导航命令做了精简：<ul><li>原来的导航命令需要查询地图，找到路由点和终点最近的车道，并得到在车道上对应的投影点；精简后的命令只需要给出坐标和朝向即可。</li><li>发送导航命令不再需要发送车辆当前的位置作为起点位置，PNC会自动获取并处理起点位置。</li></ul></li></ul><p>升级后的命令数据流程如下图：</p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/d97beb0b78343b224bb30c5e58c5a8ee83692a58">https://apollo-studio-public.bj.bcebos.com/community/article/image/d97beb0b78343b224bb30c5e58c5a8ee83692a58</a></p><p>升级前后命令功能保持不变，对照关系如下表所示：</p><div class="table-container"><table><thead><tr><th>功能</th><th>升级前命令</th><th>升级后命令</th><th>升级说明</th></tr></thead><tbody><tr><td>点到点沿道路行驶</td><td>routing::RoutingRequest</td><td>LaneFollowCommand</td><td>精简了路由点信息，新的命令给出坐标和朝向，不需要查询地图找到最近的LaneWayPoint</td></tr><tr><td>泊车</td><td>routing::RoutingRequest(包含parking_space_id)</td><td>ValetParkingCommand</td><td>升级前后都是给定parking_space_id进行泊车</td></tr><tr><td>PULL_OVER,START,STOP流程控制</td><td>planning::PadMessage</td><td>ActionCommand</td><td>升级后合并流程操作到一个命令中</td></tr><tr><td>切换自动/手动模式</td><td>control::PadMessage</td><td>ActionCommand</td><td>升级后合并流程操作到一个命令中</td></tr></tbody></table></div><p>升级后的接口有以下几个优点：</p><ul><li>命令调用更清晰简便，新的导航接口精简了数据，用户只需要设置必要的坐标和朝向信息即可。</li><li>使用service/client的调用方式，新的接口可以通过client获取命令执行的状态，查看命令是正在执行中，已经完成或有错误发生。</li><li>新的接口支持用户自定义扩展自己的命令。</li></ul><h3 id="（2）插件化"><a href="#（2）插件化" class="headerlink" title="（2）插件化"></a>（2）插件化</h3><p>插件是新版本中的支持用户灵活扩展新功能的一种方式，用户新扩展的插件符合父类程序接口规范，通过重写接口的实现来增加新的功能，插件以独立包的方式发布。</p><p>在planning中主要对scenario，task和traffic rules进行了插件化，用户可以根据场景需要，自定义添加自己的场景，任务或交通规则，具体插件添加的方式后续文档中有详细的介绍。</p><p>例如用户新增左转待转场景插件，增加一个包left_turn_waiting_zone，在这个包中添加左转待转场景的实现代码，以及相关的工程文件，编译调试后发布即可。</p><p><a 
href="https://apollo-studio-public.bj.bcebos.com/community/article/image/3b6c4db95d33bc17735ecd063023099c41c9de11">https://apollo-studio-public.bj.bcebos.com/community/article/image/3b6c4db95d33bc17735ecd063023099c41c9de11</a></p><p>需要运行这个场景时，在planning的配置文件中，添加这个场景的pipeline：</p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/68c3e9553caa9e10281a0231d408879592c7d56e">https://apollo-studio-public.bj.bcebos.com/community/article/image/68c3e9553caa9e10281a0231d408879592c7d56e</a></p><p>旧版本中不使用插件的方式，用户新增一个scenario，task，traffic rules，需要修改planning component流程代码，用于创建新的类型对象，添加新增对象的流程调用，修改proto文件等。这样的问题一个是修改处较多，修改过程繁琐；另外就是当用户在planning中增加了一个新的scenario，task，traffic rules时，后续apollo升级时，用户无法直接跟着升级，需要手动merge自己修改的代码。</p><p>使用插件的方式扩展scenario，task，traffic rules，可以实现：</p><ul><li>用户根据自己的场景，使用包管理的方式，选择性下载安装自己需要的scenario，task，traffic rules即可。</li><li>用户新增的插件独立开发，编译，发布和运行。</li><li>用户新增了插件后，可以直接跟随apollo一起升级。</li></ul><h3 id="（3）参数配置升级"><a href="#（3）参数配置升级" class="headerlink" title="（3）参数配置升级"></a>（3）参数配置升级</h3><p>planning中的配置参数量较大，入门调试时难度较高，用户想要修改的功能对应的参数不直观，并且不易快速定位需要修改的参数位置。针对这些问题，对配置参数进行了以下调整：</p><ul><li>将参数分成全局变量和局部变量，全局变量是多个算法或插件中共同使用的参数；局部变量是专属于某个算法或插件的参数。如果用户需要调整某个插件的参数，直接在插件的目录中查找。</li></ul><p>planning的全部变量在planning/planning_base/conf目录下：</p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/41ccd3fd662882ffc3cae2d94992347e5cd7d7bb">https://apollo-studio-public.bj.bcebos.com/community/article/image/41ccd3fd662882ffc3cae2d94992347e5cd7d7bb</a></p><p>planning的局部变量在插件自身的目录下，如lane_change_path这个Task插件的参数：</p><p><a href="https://apollo-studio-public.bj.bcebos.com/community/article/image/2798392db33adc3c4a10f313e2b50887cefe8d00">https://apollo-studio-public.bj.bcebos.com/community/article/image/2798392db33adc3c4a10f313e2b50887cefe8d00</a></p><p>添加对常用功能使用到的参数的说明文档，方便用户调试时查询。对参数目录进行重新梳理和作用范围的划分，有以下优点：</p><ul><li>参数的目录跟随作用范围和功能，这样对参数的定位更清晰和直观。</li><li>用户新增的插件所使用的参数，可以跟随插件进行发布和管理。</li></ul>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Planning-介绍&quot;&gt;&lt;a href=&quot;#Planning-介绍&quot; class=&quot;headerlink&quot; title=&quot;Planning 介绍&quot;&gt;&lt;/a&gt;Planning 介绍&lt;/h1&gt;&lt;p&gt;百度Apollo自动驾驶仿真平台9.0版本Planning模块相关内</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Slowing Down at Junctions</title>
    <link href="https://yang-makabaka.github.io/posts/undefined.html"/>
    <id>https://yang-makabaka.github.io/posts/undefined.html</id>
    <published>2023-11-29T04:00:27.000Z</published>
    <updated>2023-11-29T04:02:45.203Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Planning-交汇路口减速慢行"><a href="#Planning-交汇路口减速慢行" class="headerlink" title="Planning 交汇路口减速慢行"></a>Planning: Slowing Down at Junctions</h1><p>Notes on the Planning module in version 9.0 of the Baidu Apollo autonomous driving simulation platform.</p>
<h2 id="场景介绍：主车在城市道路行驶时，行驶至交汇路口时需降低速度至5米-秒，并在过后恢复正常速度。"><a href="#场景介绍：主车在城市道路行驶时，行驶至交汇路口时需降低速度至5米-秒，并在过后恢复正常速度。" class="headerlink" title="场景介绍：主车在城市道路行驶时，行驶至交汇路口时需降低速度至5米/秒，并在过后恢复正常速度。"></a>Scenario: while driving on urban roads, the ego vehicle must slow to 5 m/s when it reaches a junction and return to normal speed afterwards.</h2><p>Turning on the "Junction" display toggle shows that this intersection (the area inside the blue box in the figure below) has the road type junction.</p><p><img src="https://s2.loli.net/2023/11/29/BdJVYmQnquWLj2e.png" alt="Untitled.png"></p>
<h1 id="1-新增-traffic-rule-插件"><a href="#1-新增-traffic-rule-插件" class="headerlink" title="1.新增  traffic rule 插件"></a>1. Add a traffic rule plugin</h1>
<h3 id="按下面链接完成"><a href="#按下面链接完成" class="headerlink" title="按下面链接完成"></a>Follow the steps at the link below</h3><p><strong><a href="https://apollo.baidu.com/community/article/1121">https://apollo.baidu.com/community/article/1121</a></strong></p>
<h1 id="2-代码修改"><a href="#2-代码修改" class="headerlink" title="2.代码修改"></a>2. Code changes</h1>
<h3 id="（1）pnc-junction-overlaps改为junction-overlaps"><a href="#（1）pnc-junction-overlaps改为junction-overlaps" class="headerlink" title="（1）pnc_junction_overlaps改为junction_overlaps"></a><strong>(1) Change pnc_junction_overlaps to junction_overlaps</strong></h3><p><strong>modules/planning/traffic_rules/region_speed_limit/region_speed_limit.cc</strong></p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line"><span class="comment">// Before:</span></span><br><span class="line"><span class="keyword">const</span> std::vector&lt;PathOverlap&gt;&amp; pnc_junction_overlaps =</span><br><span class="line">    reference_line_info-&gt;reference_line().map_path().pnc_junction_overlaps();</span><br><span class="line"></span><br><span class="line"><span class="comment">// After: only the accessor name changes, the local variable keeps its name</span></span><br><span class="line"><span class="keyword">const</span> std::vector&lt;PathOverlap&gt;&amp; pnc_junction_overlaps =</span><br><span class="line">    reference_line_info-&gt;reference_line().map_path().junction_overlaps();</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/FNIAMkVvmcjxb4z.png" alt="Untitled 1.png"></p>
<h3 id="（2）TrafficRule改为apollo-planning-TrafficRule"><a href="#（2）TrafficRule改为apollo-planning-TrafficRule" class="headerlink" title="（2）TrafficRule改为apollo::planning::TrafficRule"></a><strong>(2) Change TrafficRule to apollo::planning::TrafficRule</strong></h3><p><strong>modules/planning/traffic_rules/region_speed_limit/plugin_region_speed_limit_description.xml</strong></p><p><img src="https://s2.loli.net/2023/11/29/oY3X2WCize1gZST.png" alt="Untitled 2.png"></p><p>Note: remember to save the file after editing.</p>
<h3 id="（3）编译"><a href="#（3）编译" class="headerlink" title="（3）编译"></a>(3) Build</h3><p>Run the following command in the terminal where Dreamview was started:</p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line">buildtool build -p modules/planning/traffic_rules/region_speed_limit/</span><br></pre></td></tr></table></figure>
<h3 id="4-调参"><a href="#4-调参" class="headerlink" title="(4)调参"></a>(4) Tune the parameter</h3><p>Change <strong>limit_speed</strong> from <strong>15.0</strong> to <strong>3.0</strong> in the following file:</p><p>modules/planning/traffic_rules/region_speed_limit/conf/region_speed_limit.pb.txt</p><p><img src="https://s2.loli.net/2023/11/29/CsbAzlVgQqhceu9.png" alt="Untitled 3.png"></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Planning-交汇路口减速慢行&quot;&gt;&lt;a href=&quot;#Planning-交汇路口减速慢行&quot; class=&quot;headerlink&quot; title=&quot;Planning 交汇路口减速慢行&quot;&gt;&lt;/a&gt;Planning 交汇路口减速慢行&lt;/h1&gt;&lt;p&gt;百度Apollo自动</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Perception Training and Deployment: Environment Setup</title>
    <link href="https://yang-makabaka.github.io/posts/1cc8f882.html"/>
    <id>https://yang-makabaka.github.io/posts/1cc8f882.html</id>
    <published>2023-11-29T03:54:49.000Z</published>
    <updated>2023-11-29T03:59:10.966Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Perception-训练与部署——环境配置"><a href="#Perception-训练与部署——环境配置" class="headerlink" title="Perception 训练与部署——环境配置"></a>Perception Training and Deployment: Environment Setup</h1><p>Notes on the Perception module in version 9.0 of the Baidu Apollo autonomous driving simulation platform.</p>
<h1 id="1-首先安装nvidia驱动"><a href="#1-首先安装nvidia驱动" class="headerlink" title="1. 首先安装nvidia驱动"></a>1. Install the NVIDIA driver first</h1>
<h1 id="2-安装MiniConda"><a href="#2-安装MiniConda" class="headerlink" title="2. 安装MiniConda"></a>2. Install Miniconda</h1><p>Note: if Anaconda is already installed, you do not need Miniconda.</p><p>Official site: <a href="https://docs.conda.io/en/latest/miniconda.html#linux-installers">Miniconda installation guide</a></p><p>Or see my own guide: <a href="https://yang-makabaka.github.io/posts/4120ac2f.html">https://yang-makabaka.github.io/posts/4120ac2f.html</a></p>
<h1 id="3-安装PaddlePaddle"><a href="#3-安装PaddlePaddle" class="headerlink" title="3. 安装PaddlePaddle"></a>3. Install PaddlePaddle</h1>
<h2 id="3-1-创建并进入-conda-虚拟环境"><a href="#3-1-创建并进入-conda-虚拟环境" class="headerlink" title="3.1. 创建并进入 conda 虚拟环境"></a>3.1. Create and activate a conda virtual environment</h2><p>Note: every step below must be run inside this virtual environment.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">conda create -n paddle_env python=3.8</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">source activate paddle_env</span><br></pre></td></tr></table></figure>
<h2 id="3-2-本地安装cuda和cudnn"><a href="#3-2-本地安装cuda和cudnn" class="headerlink" title="3.2. 本地安装cuda和cudnn"></a>3.2. Install CUDA and cuDNN from local packages</h2><p>Note: install these inside the paddle_env environment created above; if it is not active, run source activate paddle_env.<br>Run the command below. The figure shows CUDA Version: 12.0; note that this value is the highest CUDA version the current driver supports:</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">nvidia-smi</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/x5AwSy8aoMBDlGU.png" alt="Untitled.png"></p>
<h3 id="3-2-1-查看conda支持的cuda版本"><a href="#3-2-1-查看conda支持的cuda版本" class="headerlink" title="3.2.1. 查看conda支持的cuda版本"></a><strong>3.2.1. List the CUDA versions available through conda</strong></h3><p>Run the following command to list every CUDA version in the configured channels, along with its download URL:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">conda search cudatoolkit --info</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/8B1hjwN6KG9kXD3.png" alt="Untitled 1.png"></p>
<h3 id="3-2-2-下载cuda"><a href="#3-2-2-下载cuda" class="headerlink" title="3.2.2. 下载cuda"></a><strong>3.2.2. Download CUDA</strong></h3><p>Once you have found the CUDA version you want (11.7.0 in this example), copy the download link from the url field and download it:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># wget followed by the link you just copied</span></span><br><span class="line">wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/linux-<span class="number">64</span>/cudatoolkit-<span class="number">11.7</span><span class="number">.0</span>-hd8887f6_11.conda</span><br></pre></td></tr></table></figure>
<h3 id="3-2-3-安装cuda"><a href="#3-2-3-安装cuda" class="headerlink" title="3.2.3. 安装cuda"></a><strong>3.2.3. Install CUDA</strong></h3><p>Install it with the command below. Because the install is from a local file, the path to the package must be given. If you did not change the download location in the previous step, the package is in the directory where you ran wget (the home folder for a freshly opened terminal), so the file name alone is enough:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># conda install --use-local followed by the path to the local CUDA package</span></span><br><span class="line">conda install --use-local cudatoolkit-<span class="number">11.7</span><span class="number">.0</span>-hd8887f6_11.conda</span><br></pre></td></tr></table></figure>
<h3 id="3-2-4-查看cuda对应的cudnn版本"><a href="#3-2-4-查看cuda对应的cudnn版本" class="headerlink" title="3.2.4. 查看cuda对应的cudnn版本"></a><strong>3.2.4. Find the cuDNN version that matches your CUDA</strong></h3><p>As the figure below shows, the cuDNN version that Paddle pairs with CUDA 11.7 is v8.4.1:</p><p><img src="https://s2.loli.net/2023/11/29/dYrTShmHMjVuwiC.png" alt="Untitled 2.png"></p><p>Use the following command to list the cuDNN versions available through conda. The cuDNN version must match the CUDA version you just downloaded:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">conda search cudnn --info</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/FqA7h1Vjfm3g8co.png" alt="Untitled 3.png"></p>
<h3 id="3-2-5-下载cudnn版本"><a href="#3-2-5-下载cudnn版本" class="headerlink" title="3.2.5. 下载cudnn版本"></a><strong>3.2.5. Download cuDNN</strong></h3><p>Copy the link in the same way and download it with wget:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/linux-<span class="number">64</span>/cudnn-<span class="number">8.4</span><span class="number">.1</span><span class="number">.50</span>-hed8a83a_0.tar.bz2</span><br></pre></td></tr></table></figure>
<h3 id="3-2-6-安装cudnn"><a href="#3-2-6-安装cudnn" class="headerlink" title="3.2.6. 安装cudnn"></a><strong>3.2.6. Install cuDNN</strong></h3><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># conda install --use-local followed by the path to the local cuDNN package</span></span><br><span class="line">conda install --use-local cudnn-<span class="number">8.4</span><span class="number">.1</span><span class="number">.50</span>-hed8a83a_0.tar.bz2</span><br></pre></td></tr></table></figure>
<h2 id="3-3-安装GPU版的PaddlePaddle"><a href="#3-3-安装GPU版的PaddlePaddle" class="headerlink" title="3.3. 安装GPU版的PaddlePaddle"></a>3.3. Install the GPU build of PaddlePaddle</h2><p>Note: install it inside the paddle_env environment created above; if it is not active, run source activate paddle_env.</p>
<h3 id="3-3-1-官网获取安装指令："><a href="#3-3-1-官网获取安装指令：" class="headerlink" title="3.3.1. 官网获取安装指令："></a><strong>3.3.1. Get the install command from the official site</strong></h3><p><a href="https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html">https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html</a></p><p>As shown below, select the matching version (11.7 in this example), copy the command given under the installation information into a terminal, and wait for the install to finish:</p><p><img src="https://s2.loli.net/2023/11/29/f3Xzqjls4JuLeyP.png" alt="Untitled 4.png"></p><p>Note: if the command fails with <strong>ValueError: Trusted host URL must include a host part: '#'</strong>:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Edit the pip configuration file:</span></span><br><span class="line">vi ~/.pip/pip.conf</span><br><span class="line"></span><br><span class="line"><span class="comment"># You will see something like the following. Line 4 carries a trailing # comment; delete that comment:</span></span><br><span class="line"><span class="number">1</span> [<span class="keyword">global</span>]</span><br><span class="line"><span class="number">2</span> index-url = https://pypi.tuna.tsinghua.edu.cn/simple</span><br><span class="line"><span class="number">3</span> [install]</span><br><span class="line"><span class="number">4</span> trusted-host = https://pypi.tuna.tsinghua.edu.cn  <span class="comment"># trusted-host 此参数是为了避免麻烦，否则使用的</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># After the edit it looks like this:</span></span><br><span class="line"><span class="number">1</span> [<span class="keyword">global</span>]</span><br><span class="line"><span class="number">2</span> index-url = https://pypi.tuna.tsinghua.edu.cn/simple</span><br><span class="line"><span class="number">3</span> [install]</span><br><span class="line"><span class="number">4</span> trusted-host = https://pypi.tuna.tsinghua.edu.cn</span><br><span class="line"></span><br><span class="line"><span class="comment"># vi keys for making the edit:</span></span><br><span class="line"><span class="comment"># "i" starts editing; move the cursor with the arrow keys and change the text</span></span><br><span class="line"><span class="comment"># "Esc" leaves editing mode</span></span><br><span class="line"><span class="comment"># ":wq" saves and quits</span></span><br></pre></td></tr></table></figure>
<h3 id="3-3-2-验证安装"><a href="#3-3-2-验证安装" class="headerlink" title="3.3.2. 验证安装"></a><strong>3.3.2. Verify the installation</strong></h3><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"># Type python to start the Python interpreter</span><br><span class="line">python</span><br><span class="line"></span><br><span class="line"># In the interpreter, enter</span><br><span class="line">import paddle</span><br><span class="line"></span><br><span class="line"># Then enter</span><br><span class="line">paddle.utils.run_check()</span><br><span class="line"></span><br><span class="line"># If "PaddlePaddle is installed successfully!" appears, the installation worked.</span><br><span class="line"></span><br><span class="line"># Enter quit() to leave the interpreter</span><br><span class="line">quit()</span><br></pre></td></tr></table></figure><p>Note: if the check does not succeed:</p><p>(1) Find the path</p><p>This tutorial's environment was created by a non-root user and is named paddle_env, so the environment path is ~/.conda/envs/paddle_env and its third-party shared libraries are in ~/.conda/envs/paddle_env/lib. For an environment with a different name, the path is ~/.conda/envs/[environment name]/lib.<br>If you are not sure of the install path, activate paddle_env and run:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">python -c <span class="string">&quot;import paddle; print(paddle.__file__)&quot;</span></span><br></pre></td></tr></table></figure><p>This prints the install path, for example:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">/home/ubuntu/.conda/envs/paddle_env/lib/python3<span class="number">.8</span>/site-packages/paddle/__init__.py</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># The corresponding library path is</span></span><br><span class="line">/home/ubuntu/.conda/envs/paddle_env/lib</span><br><span class="line"><span class="comment"># or</span></span><br><span class="line">~/.conda/envs/paddle_env/lib</span><br><span class="line"><span class="comment"># The ~ form (relative to the home directory) is recommended</span></span><br></pre></td></tr></table></figure><p>(2) Add the environment variable</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Temporary option</span></span><br><span class="line"><span class="comment"># Set the environment variable each time before running a program</span></span><br><span class="line">export LD_LIBRARY_PATH=~/.conda/envs/paddle_env/lib</span><br><span class="line">python xxx.py</span><br><span class="line"></span><br><span class="line"><span class="comment"># Permanent option (recommended)</span></span><br><span class="line"><span class="comment"># Append the environment variable to ~/.bashrc</span></span><br><span class="line">echo <span class="string">&quot;export LD_LIBRARY_PATH=~/.conda/envs/paddle_env/lib&quot;</span>&gt;&gt;~/.bashrc</span><br><span class="line"><span class="comment"># Afterwards, close and reopen the terminal, or log in again.</span></span><br></pre></td></tr></table></figure>
<h1 id="4-安装Paddle3D"><a href="#4-安装Paddle3D" class="headerlink" title="4. 安装Paddle3D"></a>4. <strong>Install Paddle3D</strong></h1>
<h2 id="4-1-下载Paddle3D源码"><a href="#4-1-下载Paddle3D源码" class="headerlink" title="4.1. 下载Paddle3D源码"></a>4.1. Download the Paddle3D source</h2><p>It is best to open a terminal inside the apolloDataSet data folder created earlier and clone there; a Paddle3D folder is created when the clone finishes:</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">git clone https://github.com/PaddlePaddle/Paddle3D</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/nIZV8ewhrkvNcF1.png" alt="Untitled 5.png"></p>
<h2 id="4-2-安装Paddle3D依赖"><a href="#4-2-安装Paddle3D依赖" class="headerlink" title="4.2. 安装Paddle3D依赖"></a>4.2. Install the Paddle3D dependencies</h2><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"># Change into the Paddle3D folder</span><br><span class="line">cd Paddle3D</span><br><span class="line"></span><br><span class="line"># Make sure the paddle_env virtual environment is active (skip if it already is)</span><br><span class="line">#source activate paddle_env</span><br><span class="line"></span><br><span class="line"># Install the packages listed in requirements.txt</span><br><span class="line">pip install -r requirements.txt</span><br></pre></td></tr></table></figure>
<h2 id="4-3-安装Paddle3D"><a href="#4-3-安装Paddle3D" class="headerlink" title="4.3. 安装Paddle3D"></a>4.3. Install Paddle3D</h2><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Install in editable mode</span></span><br><span class="line">pip install -e .</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Perception-训练与部署——环境配置&quot;&gt;&lt;a href=&quot;#Perception-训练与部署——环境配置&quot; class=&quot;headerlink&quot; title=&quot;Perception 训练与部署——环境配置&quot;&gt;&lt;/a&gt;Perception 训练与部署——环境</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Planning: Detouring Around a Broken-Down Vehicle</title>
    <link href="https://yang-makabaka.github.io/posts/33ff5d2.html"/>
    <id>https://yang-makabaka.github.io/posts/33ff5d2.html</id>
    <published>2023-11-29T02:21:22.000Z</published>
    <updated>2023-11-29T03:48:02.301Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Planning-车辆故障绕行"><a href="#Planning-车辆故障绕行" class="headerlink" title="Planning 车辆故障绕行"></a>Planning: Detouring Around a Broken-Down Vehicle</h1><p>Notes on the Planning module in version 9.0 of the Baidu Apollo autonomous driving simulation platform.</p>
<h2 id="场景介绍：主车在城市道路行驶时，当遇到前方的故障车辆，为了保证行驶安全应及时减速并绕行，确保与障碍物之间的横向距离至少为1米，并控制绕行速度在5米-秒以内。"><a href="#场景介绍：主车在城市道路行驶时，当遇到前方的故障车辆，为了保证行驶安全应及时减速并绕行，确保与障碍物之间的横向距离至少为1米，并控制绕行速度在5米-秒以内。" class="headerlink" title="场景介绍：主车在城市道路行驶时，当遇到前方的故障车辆，为了保证行驶安全应及时减速并绕行，确保与障碍物之间的横向距离至少为1米，并控制绕行速度在5米/秒以内。"></a>Scenario: while driving on urban roads, when the ego vehicle meets a broken-down vehicle ahead, it should slow down promptly and go around it to stay safe, keeping a lateral distance of at least 1 m from the obstacle and a detour speed of no more than 5 m/s.</h2><p>Follow the four steps below in order; do not skip any of them.</p>
<h2 id="一、DreamView使用步骤"><a href="#一、DreamView使用步骤" class="headerlink" title="一、DreamView使用步骤"></a>I. Using DreamView</h2><p>(1) Open a terminal in the project folder "application-pnc".</p><p>(2) Run these commands one after another:</p><p>aem start</p><p>aem enter</p><p>aem bootstrap start</p><p>(3) Once it is running, enter <a href="http://localhost:8888/">http://localhost:8888/</a> in the browser's address bar.</p><p><img src="https://s2.loli.net/2023/11/29/htwpnxQZDS98RcB.png" alt="Untitled.png"></p><p>(4) Select the mode Mkz Standard Debug and the map Apollo Virutal Map, turn on Sim_Control, and open PNC Monitor. When the Mkz vehicle model and the map appear in the middle of the screen, you are in simulation mode.</p><p>(5) Click Module Controller in the left tab bar and start the Planning and Prediction modules.</p><p>(6) Open the scenario: click Profile in the left bar and select the competition scenario set you created. A scenario selector then appears at the top right of the screen; pick the scenario and the vehicle drives along its predefined route.</p><p><img src="https://s2.loli.net/2023/11/29/Qmv2SFxCNypHLsB.png" alt="Untitled 1.png"></p><p>Note: sometimes the Planning and Prediction modules shut down on their own, or the vehicle does not move after a scenario is selected. Start the modules again and reselect the scenario.</p>
<h2 id="二、配置参数同步"><a href="#二、配置参数同步" class="headerlink" title="二、配置参数同步"></a>II. Sync the configuration parameters</h2><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line"># Run the global configuration sync commands. They copy the global configuration parameters into the profile's default directory, where they can then be edited easily</span><br><span class="line">buildtool profile config init --package planning --profile=<span class="keyword">default</span></span><br><span class="line">buildtool profile config init --package planning-task-speed-bounds-decider --profile=<span class="keyword">default</span></span><br><span class="line"></span><br><span class="line"># Use the default profile</span><br><span class="line">aem profile use <span class="keyword">default</span></span><br></pre></td></tr></table></figure>
<h2 id="三、代码修改"><a href="#三、代码修改" class="headerlink" title="三、代码修改"></a>III. Code changes</h2>
<h3 id="1-修改全局配置参数："><a href="#1-修改全局配置参数：" class="headerlink" title="1.修改全局配置参数："></a>1. Edit the global configuration parameters</h3><p>Find planning.conf in profiles/default/modules/planning/planning_base/conf/ (VS Code is recommended for opening it).</p><p>(1) Add the lateral buffer parameter</p><p>Add the line below to the file. (Why add it here directly? The nudge-related files elsewhere already configure this parameter, but a value set here takes precedence in the project, so setting it here is both effective and easier to see.)</p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line">--obstacle_lat_buffer=<span class="number">1.5</span></span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/HvmZ2zrKCqApXMV.png" alt="Untitled 2.png"></p><p>(2) Change the global speed limit and the default cruise speed</p><p>Change the values of these parameters (they already exist in the file, so nothing needs to be added) to the values shown below:</p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line">--planning_upper_speed_limit=<span class="number">11.18</span></span><br><span class="line">--default_cruise_speed=<span class="number">11.18</span></span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/GyOM7tisQohkCqB.png" alt="Untitled 3.png"></p><p>Note: parameters commented out with "#" have no effect. Uncomment them to use them (shortcut Ctrl+/).</p><p><img src="https://s2.loli.net/2023/11/29/2iu6XwdMUxGrA9J.png" alt="Untitled 4.png"></p>
<h3 id="2-修改局部配置参数"><a href="#2-修改局部配置参数" class="headerlink" title="2.修改局部配置参数"></a>2. Edit the local configuration parameters</h3><p>Open default_conf.pb.txt under profiles/default/modules/planning/tasks/speed_bounds_decider/conf/.</p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line"># Note: these are the parameter changes themselves. Applying only these had no effect in my tests, possibly because a full-width character was mixed in with the ASCII punctuation somewhere; read them just to see what changes</span><br><span class="line"># Change static_obs_nudge_speed_ratio from 0.6 to 0.3</span><br><span class="line"><span class="attr">static_obs_nudge_speed_ratio</span>: <span class="number">0.3</span></span><br><span class="line"></span><br><span class="line"># Add the collision_safety_range setting</span><br><span class="line"><span class="attr">collision_safety_range</span>: <span class="number">5.0</span></span><br></pre></td></tr></table></figure><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line"># Note: this is the step to actually perform. Replace the contents of default_conf.pb.txt with the block below, as shown in the figure.</span><br><span class="line"><span class="attr">total_time</span>: <span class="number">7.0</span></span><br><span class="line"><span class="attr">boundary_buffer</span>: <span class="number">0.25</span></span><br><span class="line"><span class="attr">max_centric_acceleration_limit</span>: <span class="number">0.8</span></span><br><span class="line"><span class="attr">point_extension</span>: <span class="number">0.0</span></span><br><span class="line"><span class="attr">lowest_speed</span>: <span class="number">0.1</span></span><br><span class="line"></span><br><span class="line"><span class="attr">static_obs_nudge_speed_ratio</span>: <span class="number">0.3</span></span><br><span class="line"></span><br><span class="line"><span class="attr">dynamic_obs_nudge_speed_ratio</span>: <span class="number">0.8</span></span><br><span class="line"><span class="attr">enable_nudge_slowdown</span>: <span class="literal">true</span></span><br><span class="line"><span class="attr">lane_change_obstacle_nudge_l_buffer</span>: <span class="number">0.3</span></span><br><span class="line"><span class="attr">max_trajectory_len</span>: <span class="number">1000.0</span></span><br><span class="line"></span><br><span class="line"><span class="attr">collision_safety_range</span>: <span class="number">5.0</span></span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/JerT9dIjAzR3y6V.png" alt="Untitled 5.png"></p><p>Note: remember to save your changes (Ctrl+S).</p><p>If modules fail to start, obstacles do not appear, or the scenario does not play, run aem bootstrap restart in the terminal where DreamView was started.</p><p><img src="https://s2.loli.net/2023/11/29/oW3PAN8qntGOErU.png" alt="Untitled 6.png"></p><p>Then start the Planning module again in DreamView and reselect the scenario. The speed during the detour is now held below 5 m/s.</p><p><img src="https://s2.loli.net/2023/11/29/7oCQFwglpc2AUi4.png" alt="Untitled 7.png"></p>
<h2 id="四、提交评测"><a href="#四、提交评测" class="headerlink" title="四、提交评测"></a>IV. Submit for evaluation</h2><p>(1) Create the archive</p><figure class="highlight jsx"><table><tr><td class="code"><pre><span class="line"># In a terminal opened in the project folder, run the command below; the archive then appears in the project folder</span><br><span class="line">tar -zcvf your_archive_name.<span class="property">tar</span>.<span class="property">gz</span> modules/planning/ profiles/</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/29/ZWtA8JoezqVnMPf.png" alt="Untitled 8.png"></p><p>(2) Submit the code</p><p>Drag the archive into the submission area; take care not to drop it anywhere else.</p><p>At present, submissions can only be evaluated under pre-competition practice. Scores and evaluation results are then available in the submission history.</p><p><img src="https://s2.loli.net/2023/11/29/ShqJ7anlm68Y5I4.png" alt="Untitled 9.png"></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Planning-车辆故障绕行&quot;&gt;&lt;a href=&quot;#Planning-车辆故障绕行&quot; class=&quot;headerlink&quot; title=&quot;Planning 车辆故障绕行&quot;&gt;&lt;/a&gt;Planning 车辆故障绕行&lt;/h1&gt;&lt;p&gt;百度Apollo自动驾驶仿真平台9.</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Introduction to Perception</title>
    <link href="https://yang-makabaka.github.io/posts/85b392e2.html"/>
    <id>https://yang-makabaka.github.io/posts/85b392e2.html</id>
    <published>2023-11-27T10:55:06.000Z</published>
    <updated>2023-11-27T13:54:44.157Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Perception-介绍"><a href="#Perception-介绍" class="headerlink" title="Perception 介绍"></a>Introduction to Perception</h1><p>Notes on the Perception module of the Baidu Apollo autonomous driving simulation platform, version 9.0.</p><h1 id="一、感知模块代码结构"><a href="#一、感知模块代码结构" class="headerlink" title="一、感知模块代码结构"></a>I. Perception Module Code Structure</h1><p><img src="https://s2.loli.net/2023/11/27/3Lm7a6gyjzrEcXV.png" alt="_E6_84_9F_E7_9F_A5_E4_BB_A3_E7_A0_81_E7_BB_93_E6_9E_84.png"></p><h2 id="1、Camera代码结构"><a href="#1、Camera代码结构" class="headerlink" title="1、Camera代码结构"></a>1. Camera Code Structure</h2><p><img src="https://s2.loli.net/2023/11/27/L9ugtfmaWkFTDCw.png" alt="_E6_88_AA_E5_9B_BE_2023-10-08_13-48-55.png"></p><h2 id="2、Lidar代码结构"><a href="#2、Lidar代码结构" class="headerlink" title="2、Lidar代码结构"></a>2. Lidar Code Structure</h2><p><img src="https://s2.loli.net/2023/11/27/nlJZWf5owIYPFXc.png" alt="_E6_84_9F_E7_9F_A5Lidar_E6_A8_A1_E5_9D_97_E4_BB_A3_E7_A0_81_E7_BB_93_E6_9E_84.png"></p><h2 id="3、Radar代码结构"><a href="#3、Radar代码结构" class="headerlink" title="3、Radar代码结构"></a>3. Radar Code Structure</h2><p><img src="https://s2.loli.net/2023/11/27/TKJgPB3ML648jF9.png" alt="_E6_84_9F_E7_9F_A5Radar_E6_A8_A1_E5_9D_97_E4_BB_A3_E7_A0_81_E7_BB_93_E6_9E_84.png"></p><h2 id="4、Fusion代码结构"><a href="#4、Fusion代码结构" class="headerlink" title="4、Fusion代码结构"></a>4. Fusion Code Structure</h2><p><img src="https://s2.loli.net/2023/11/27/jOAenw1GCpiDSrm.png" alt="_E6_84_9F_E7_9F_A5Fusion_E6_A8_A1_E5_9D_97_E4_BB_A3_E7_A0_81_E7_BB_93_E6_9E_84.png"></p><h1 id="二、Lidar感知"><a href="#二、Lidar感知" class="headerlink" title="二、Lidar感知"></a>II. Lidar Perception</h1><h2 id="1、组成分析"><a href="#1、组成分析" class="headerlink" title="1、组成分析"></a>1. Component Analysis</h2><p><img src="https://s2.loli.net/2023/11/27/BOQPJ3U4dv7MIln.png" alt="_E6_BF_80_E5_85_89_E9_9B_B7_E8_BE_BE_E6_84_9F_E7_9F_A5_E6_B5_81_E7_A8_8B.png"></p><p><img src="https://s2.loli.net/2023/11/27/38nmBJDuveUIlzk.png" alt="Lidar_E6_84_9F_E7_9F_A5_E4_BB_A3_E7_A0_81_E5_8A_9F_E8_83_BD.png"></p><p><img src="https://s2.loli.net/2023/11/27/OtfpiU2nsJqWuFG.png" 
alt="Lidar_E6_84_9F_E7_9F_A5_E9_85_8D_E7_BD_AE_E6_96_87_E4_BB_B6.png"></p><p><img src="https://s2.loli.net/2023/11/27/AywP4mNEIO7vRQC.png" alt="_E7_BB_A7_E6_89_BF_E7_B1_BB_E5_85_B3_E7_B3_BB.png"></p><h2 id="2、函数解析"><a href="#2、函数解析" class="headerlink" title="2、函数解析"></a>2. Function Walkthrough</h2><p><img src="https://s2.loli.net/2023/11/27/BEl745vJr6D12IA.png" alt="Untitled.png"></p><h3 id="（1）LidarDetectionComponent-Init"><a href="#（1）LidarDetectionComponent-Init" class="headerlink" title="（1）LidarDetectionComponent::Init()"></a>(1) LidarDetectionComponent::Init()</h3><p>Initializes the component: this function initializes the sensor, reads the configuration file, and initializes the pipeline configuration.</p><h3 id="（2）InternalProc"><a href="#（2）InternalProc" class="headerlink" title="（2）InternalProc()"></a>(2) InternalProc()</h3><p>The core function of LidarDetectionComponent. It converts the incoming point cloud data structure into a LidarFrame and then invokes the callback that carries out the actual processing logic.</p><p><img src="https://s2.loli.net/2023/11/27/Tl5HRjZQhGedWz6.png" alt="Untitled 1.png"></p><h3 id="（3）LidarObstacleDetection-Init"><a href="#（3）LidarObstacleDetection-Init" class="headerlink" title="（3）LidarObstacleDetection::Init()"></a>(3) LidarObstacleDetection::Init()</h3><p>The initialization function for detection. Based on the pipeline configuration file, it instantiates each Stage and Task and initializes each Stage's configuration.</p><p><img src="https://s2.loli.net/2023/11/27/HmrUd8VEuKGwMDY.png" alt="Untitled 2.png"></p><h3 id="（4）LidarObstacleDetection-Process"><a href="#（4）LidarObstacleDetection-Process" class="headerlink" title="（4）LidarObstacleDetection::Process()"></a>(4) LidarObstacleDetection::Process()</h3><p>Calls the Process function of each Stage and Task in turn, in the order given by the pipeline configuration file. Users can choose different detection algorithms and different pre- and post-processing algorithms to suit their needs, combining Stages in different ways.</p><h2 id="3、数据结构解析"><a href="#3、数据结构解析" class="headerlink" title="3、数据结构解析"></a>3. Data Structures</h2><h3 id="1-查看点云数据"><a href="#1-查看点云数据" class="headerlink" title="1 查看点云数据"></a>1 Viewing Point Cloud Data</h3><figure class="highlight jsx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">(<span 
class="number">1</span>) In a terminal, run</span><br><span class="line">cyber_recorder play -f databag/sensor_rgb.<span class="property">record</span> -l</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/27/YUgC2r4bPNxwoVu.png" alt="Untitled 3.png"></p><figure class="highlight jsx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">(<span class="number">2</span>) Open a new terminal</span><br><span class="line">cyber_monitor</span><br></pre></td></tr></table></figure><p><img src="https://s2.loli.net/2023/11/27/HOxzd4oDGJCcNBR.png" alt="Untitled 4.png"></p><p>Select the lidar point cloud channel and press the right arrow key to open it.</p><p><img src="https://s2.loli.net/2023/11/27/2rcujpPHTLC4m3Y.png" alt="Untitled 5.png"></p><p>MessageType: the data format of the channel</p><p>FrameRatio: the frame rate of the channel</p><p>header: the channel's header information (timestamp, sequence number, frame_id, etc.)</p><p>Point: the lidar point cloud data itself (position, intensity, timestamp, etc. of each point)</p><h3 id="2-在Dreamview查看点云数据"><a href="#2-在Dreamview查看点云数据" class="headerlink" title="2 在Dreamview查看点云数据"></a>2 Viewing Point Cloud Data in Dreamview</h3><figure class="highlight jsx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"># Start <span class="title class_">Dreamview</span></span><br><span class="line">bash scripts/bootstrap_neo.<span class="property">sh</span></span><br></pre></td></tr></table></figure><p>Click layer menu, then turn on point cloud, to view the lidar point cloud in Dreamview.</p><h3 id="3-激光雷达感知的数据结构"><a href="#3-激光雷达感知的数据结构" class="headerlink" title="3 激光雷达感知的数据结构"></a>3 Lidar Perception Data Structures</h3><figure class="highlight jsx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">(<span class="number">1</span>) Play the record with the following command</span><br><span class="line">cyber_recorder play -f databag/demo_3<span class="number">.5</span>.<span class="property">record</span> 
-l</span><br></pre></td></tr></table></figure><figure class="highlight jsx"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">(<span class="number">2</span>) Open a new terminal</span><br><span class="line">cyber_monitor</span><br></pre></td></tr></table></figure><p>Select /apollo/perception/obstacles</p><p><img src="https://s2.loli.net/2023/11/27/Mrg5Y9vncHkeNA2.png" alt="Untitled 6.png"><br>perception_obstacle: an individual perception result</p><p>id: the obstacle's ID after tracking</p><p>position: the obstacle's position in the world coordinate system</p><p>theta: the obstacle's heading in the world coordinate system</p><p>velocity: the obstacle's velocity</p><p>length, width, height: the obstacle's dimensions</p><p>type: the obstacle's type</p><h1 id="三、感知传感器"><a href="#三、感知传感器" class="headerlink" title="三、感知传感器"></a>III. Perception Sensors</h1><p><img src="https://s2.loli.net/2023/11/27/cAr6zEYuLJ8421B.png" alt="Untitled 7.png"></p><p><img src="https://s2.loli.net/2023/11/27/pN3Bo7nYD4l9QXT.png" alt="Untitled 8.png"></p><p><img src="https://s2.loli.net/2023/11/27/lAmRUPQdH2jh4ag.png" alt="Untitled 9.png"></p><p><img src="https://s2.loli.net/2023/11/27/CRL6TetDxg2dPmY.png" alt="Untitled 10.png"></p><p><img src="https://s2.loli.net/2023/11/27/z7ijcIhDJ5KQa21.png" alt="Untitled 11.png"></p><p><img src="https://s2.loli.net/2023/11/27/kMrq34Bz9eZpyn6.png" alt="Untitled 12.png"></p><p><img src="https://s2.loli.net/2023/11/27/Z8xP6KTsYiVdk93.png" alt="Untitled 13.png"></p><p><img src="https://s2.loli.net/2023/11/27/EHbBZgsWeyCTU2o.png" alt="Untitled 14.png"></p><p><img src="https://s2.loli.net/2023/11/27/flKQF5TPq7m68kr.png" alt="Untitled 15.png"></p><p><img src="https://s2.loli.net/2023/11/27/nOC3YjuyS8ti4ra.png" alt="Untitled 16.png"></p><h1 id="四、课程链接"><a href="#四、课程链接" class="headerlink" title="四、课程链接"></a>IV. Course Link</h1><p><a href="https://apollo.baidu.com/community/article/1133">Perception 2.0 Overview (Apollo Developer Community)</a></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Perception-介绍&quot;&gt;&lt;a href=&quot;#Perception-介绍&quot; class=&quot;headerlink&quot; title=&quot;Perception 介绍&quot;&gt;&lt;/a&gt;Perception 介绍&lt;/h1&gt;&lt;p&gt;百度Apollo自动驾驶仿真平台9.0版本Perce</summary>
      
    
    
    
    
    <category term="Apollo" scheme="https://yang-makabaka.github.io/tags/Apollo/"/>
    
  </entry>
  
  <entry>
    <title>Installing and Using Miniconda3 on Ubuntu</title>
    <link href="https://yang-makabaka.github.io/posts/4120ac2f.html"/>
    <id>https://yang-makabaka.github.io/posts/4120ac2f.html</id>
    <published>2023-04-21T15:35:24.000Z</published>
    <updated>2023-04-21T15:44:02.371Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Ubuntu安装与使用miniconda3"><a href="#Ubuntu安装与使用miniconda3" class="headerlink" title="Ubuntu安装与使用miniconda3"></a>Installing and Using Miniconda3 on Ubuntu</h1><h3 id="1-确保所有系统包都是最新的"><a href="#1-确保所有系统包都是最新的" class="headerlink" title="1. 确保所有系统包都是最新的"></a>1. Make sure all system packages are up to date</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo apt update</span><br><span class="line">sudo apt upgrade</span><br></pre></td></tr></table></figure><h3 id="2-官网下载miniconda"><a href="#2-官网下载miniconda" class="headerlink" title="2.官网下载miniconda"></a>2. Download Miniconda from the official site</h3><p><a href="https://docs.conda.io/en/latest/miniconda.html#linux-installers">https://docs.conda.io/en/latest/miniconda.html#linux-installers</a></p><h3 id="3-安装miniconda"><a href="#3-安装miniconda" class="headerlink" title="3.安装miniconda"></a>3. Install Miniconda</h3><p>(1) Open a terminal in the download directory (usually Downloads) and run the following to start the installation</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo sh Miniconda3-py39_23.1.0-1-Linux-x86_64.sh <span class="comment"># use the name of the file you actually downloaded</span></span><br></pre></td></tr></table></figure><p>Press Enter and type yes when prompted. When asked whether to install to the default directory or choose another one, entering the following path is recommended (software is commonly installed under /opt)</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/opt/miniconda3</span><br></pre></td></tr></table></figure><p>When asked whether to run the initialization, choose yes</p><p>(2) Disable automatic activation of the base environment</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda config --<span class="built_in">set</span> auto_activate_base <span class="literal">false</span></span><br></pre></td></tr></table></figure><p>(3) Manual initialization</p><p>Install vim</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get install vim</span><br></pre></td></tr></table></figure><p>Run the following to open the file in which the environment variable is set</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo vim /etc/bash.bashrc</span><br></pre></td></tr></table></figure><p>Add the snippet below to that file</p><p>Basic vim usage:</p><p>i: start editing</p><p>Esc: leave editing mode</p><p>:wq: save and quit</p><pre><code>if [ -d &quot;/opt/miniconda3/bin/&quot; ] ; then
  export PATH=/opt/miniconda3/bin:$PATH
fi</code></pre><p>(4) Reload the environment variables</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">source</span> /etc/bash.bashrc</span><br></pre></td></tr></table></figure><p>(5) Switch to a mirror</p><pre><code>conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes</code></pre><h3 id="4-创建第一个虚拟环境"><a href="#4-创建第一个虚拟环境" class="headerlink" title="4.创建第一个虚拟环境"></a>4. Create your first virtual environment</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># envname: the name of the environment to create; remember it</span></span><br><span class="line"><span class="comment"># python=3.x: the Python version inside the environment, e.g. python=3.6</span></span><br><span class="line">conda create --name envname 
python=3.x</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">source</span> activate <span class="comment"># activate conda; (base) now appears at the start of the prompt</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda activate envname <span class="comment"># envname is the environment created above; the prompt prefix changes from (base) to (envname)</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">python</span><br><span class="line"><span class="comment"># Running python now prints something like the following; the environment's Python is ready to use</span></span><br><span class="line"><span class="comment">#(opencv) yangfangzheng@yangfangzheng:~$ python</span></span><br><span class="line">Python 3.6.15 | packaged by conda-forge | (default, Dec  3 2021, 18:49:41) </span><br><span class="line">[GCC 9.4.0] on linux</span><br><span class="line">Type <span class="string">&quot;help&quot;</span>, <span class="string">&quot;copyright&quot;</span>, <span class="string">&quot;credits&quot;</span> or <span class="string">&quot;license&quot;</span> <span class="keyword">for</span> more information.</span><br><span class="line">&gt;&gt;&gt; <span class="built_in">print</span>(1212)</span><br><span class="line">1212</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda deactivate <span 
class="comment"># leave the virtual environment</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda <span class="built_in">env</span> remove -n envname <span class="comment"># delete an existing virtual environment</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">conda install python=3.9 <span class="comment"># upgrade the environment's Python to 3.9; activate the environment first, then run this</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Copying someone else's environment</span></span><br><span class="line"><span class="comment"># (1) Copy the other person's virtual environment into /opt/miniconda3/envs</span></span><br><span class="line"><span class="comment"># (2) Register it with conda</span></span><br><span class="line">conda config --add envs_dirs /opt/miniconda3/envs/envname</span><br><span class="line"><span class="comment"># (3) Locate the environment, go into its bin directory, and change the first line of the scripts there to your own path.</span></span><br></pre></td></tr></table></figure><h3 id="5-出现问题怎么办？"><a href="#5-出现问题怎么办？" class="headerlink" title="5.出现问题怎么办？"></a>5. Troubleshooting</h3><h5 id="E-无法修正错误-因为您要求某些软件包保持现状-就是它们破坏了软件包间的依赖关系"><a href="#E-无法修正错误-因为您要求某些软件包保持现状-就是它们破坏了软件包间的依赖关系" class="headerlink" title="E: 无法修正错误,因为您要求某些软件包保持现状,就是它们破坏了软件包间的依赖关系"></a>E: Unable to correct problems, you have held broken packages.</h5><p>First install aptitude:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo apt-get install aptitude</span><br></pre></td></tr></table></figure><p>Then install the package with aptitude:</p><figure class="highlight plaintext"><table><tr><td 
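class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo aptitude install openssh-server</span><br></pre></td></tr></table></figure><p>If the proposed solution still leaves unresolved dependencies, answer n; aptitude then recomputes a workable solution, which may include downgrading packages that are already installed.</p><p>After that, uninstall conda and reinstall it.</p>]]></content>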
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Ubuntu安装与使用miniconda3&quot;&gt;&lt;a href=&quot;#Ubuntu安装与使用miniconda3&quot; class=&quot;headerlink&quot; title=&quot;Ubuntu安装与使用miniconda3&quot;&gt;&lt;/a&gt;Ubuntu安装与使用miniconda3&lt;/</summary>
      
    
    
    
    
    <category term="Linux" scheme="https://yang-makabaka.github.io/tags/Linux/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——7</title>
    <link href="https://yang-makabaka.github.io/posts/b5e212c6.html"/>
    <id>https://yang-makabaka.github.io/posts/b5e212c6.html</id>
    <published>2022-12-24T13:36:06.000Z</published>
    <updated>2022-12-24T13:40:51.753Z</updated>
    
    <content type="html"><![CDATA[<h1 id="25像素重映射"><a href="#25像素重映射" class="headerlink" title="25像素重映射"></a>25 <strong>Pixel Remapping</strong></h1><h4 id="把像素点P-x-y-重新映射到一个新的位置P’-x’-y’"><a href="#把像素点P-x-y-重新映射到一个新的位置P’-x’-y’" class="headerlink" title="把像素点P(x,y)重新映射到一个新的位置P’(x’, y’)"></a>Remap pixel P(x, y) to a new position P'(x', y')</h4><h4 id="像素重映射函数"><a href="#像素重映射函数" class="headerlink" title="像素重映射函数"></a>Pixel remapping function</h4><h4 id="cv-remap-src-map1-map2-interpolation-dst-borderMode-borderValue-gt-dst"><a href="#cv-remap-src-map1-map2-interpolation-dst-borderMode-borderValue-gt-dst" class="headerlink" title="cv.remap(src, map1, map2, interpolation[, dst[, borderMode[, borderValue]]] ) -&gt;dst"></a>cv.remap(src, map1, map2, interpolation[, dst[, borderMode[, borderValue]]] ) -&gt; dst</h4><p>    •src is the input image</p><p>    •map1 is the combined (x, y) map, or the x map alone</p><p>    •map2 is empty when map1 holds the (x, y) map; otherwise it is the y map</p><p>    •interpolation is the pixel interpolation method used during remapping; supported values include INTER_NEAREST, INTER_LINEAR, and INTER_CUBIC</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"># 25 Pixel remapping (assumes: import cv2 as cv, import numpy as np)</span><br><span class="line">def remap_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    cv.namedWindow(&quot;remap-demo&quot;, cv.WINDOW_AUTOSIZE)</span><br><span class="line">    cv.createTrackbar(&quot;remap-type&quot;, &quot;remap-demo&quot;, 0, 3, lambda pos: None)</span><br><span class="line">    h, w, c = image.shape</span><br><span class="line">    cv.imshow(&quot;input&quot;, image)</span><br><span class="line">    map_x = np.zeros((h, w), dtype=np.float32)</span><br><span class="line">    map_y = np.zeros((h, w), dtype=np.float32)</span><br><span class="line"></span><br><span class="line">    while True:</span><br><span class="line">        pos = cv.getTrackbarPos(&quot;remap-type&quot;, &quot;remap-demo&quot;)</span><br><span class="line">        if pos == 0:  # flip vertically (upside down)</span><br><span class="line">            for i in range(map_x.shape[0]):</span><br><span class="line">                map_x[i, :] = [x for x in range(map_x.shape[1])]</span><br><span class="line">            for j in range(map_y.shape[1]):</span><br><span class="line">                map_y[:, j] = [map_y.shape[0] - 1 - y for y in range(map_y.shape[0])]</span><br><span class="line">        elif pos == 1:  # mirror horizontally</span><br><span class="line">            for i in range(map_x.shape[0]):</span><br><span class="line">                map_x[i, :] = [map_x.shape[1] - 1 - x for x in range(map_x.shape[1])]</span><br><span class="line">            for j in range(map_y.shape[1]):</span><br><span class="line">                map_y[:, j] = [y for y in range(map_y.shape[0])]</span><br><span class="line">        elif pos == 2:  # flip both axes (symmetric about the center)</span><br><span 
class="line">            for i in range(map_x.shape[0]):</span><br><span class="line">                map_x[i, :] = [map_x.shape[1] - 1 - x for x in range(map_x.shape[1])]</span><br><span class="line">            for j in range(map_y.shape[1]):</span><br><span class="line">                map_y[:, j] = [map_y.shape[0] - 1 - y for y in range(map_y.shape[0])]</span><br><span class="line">        elif pos == 3:  # zoom in 2x</span><br><span class="line">            for i in range(map_x.shape[0]):</span><br><span class="line">                map_x[i, :] = [int(x/2) for x in range(map_x.shape[1])]</span><br><span class="line">            for j in range(map_y.shape[1]):</span><br><span class="line">                map_y[:, j] = [int(y/2) for y in range(map_y.shape[0])]</span><br><span class="line"></span><br><span class="line">        dst = cv.remap(image, map_x, map_y, cv.INTER_LINEAR)</span><br><span class="line">        cv.imshow(&quot;remap-demo&quot;, dst)</span><br><span class="line">        c = cv.waitKey(100)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="26图像二值化"><a href="#26图像二值化" class="headerlink" title="26图像二值化"></a>26 <strong>Image Binarization</strong></h1><h4 id="图像二值化定义"><a href="#图像二值化定义" class="headerlink" title="图像二值化定义"></a>Definition</h4><p>•A binary image has only two pixel values, 0 and 1 (0 is shown as black; any value in 1-255 is shown as white). By convention, black is the background and white is the object.</p><h4 id="图像二值化方法"><a href="#图像二值化方法" class="headerlink" title="图像二值化方法"></a>Binarization methods</h4><p>•cv.mean: compute the mean <em>m</em> of the grayscale image and use it as the threshold</p><p>•inRange: segment by a range of values</p><h4 id="二值化函数"><a href="#二值化函数" class="headerlink" title="二值化函数"></a>Binarization function</h4><p><strong>cv.threshold</strong>(<strong>src</strong>, thresh,<strong>maxval</strong>, type[,<strong>dst</strong>]) -&gt;<strong>retval</strong>, <strong>dst</strong></p><p>    <strong>src</strong> is the input image</p><p>    <strong>thresh</strong> is the threshold</p><p>    <strong>maxval</strong> is the value assigned to pixels that pass the threshold</p><p>    <strong>type</strong> is  
<strong>THRESH_BINARY</strong> (binary)   or   <strong>THRESH_BINARY_INV</strong> (inverted binary)</p><p>    <strong>retval</strong> is the threshold that was used; <strong>dst</strong> is the resulting binary image</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"># 26 Image binarization</span><br><span class="line">def binary_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)</span><br><span class="line">    cv.imshow(&quot;gray&quot;, gray)</span><br><span class="line"></span><br><span class="line">    # binarize with a manually chosen threshold</span><br><span class="line">    ret, binary = cv.threshold(gray, 127, 255, cv.THRESH_BINARY)</span><br><span class="line">    cv.imshow(&quot;binary&quot;,binary)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line"></span><br><span class="line">    # binarize with the image mean as the threshold</span><br><span class="line">    m = cv.mean(gray)[0]</span><br><span class="line">    ret, binary = cv.threshold(gray, m, 255, cv.THRESH_BINARY)</span><br><span class="line">    cv.imshow(&quot;binary&quot;, binary)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line"></span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="27全局与自适应二值化"><a href="#27全局与自适应二值化" class="headerlink" title="27全局与自适应二值化"></a>27 <strong>Global and Adaptive Binarization</strong></h1><h3 
id="全局二值化"><a href="#全局二值化" class="headerlink" title="全局二值化"></a>Global binarization</h3><h4 id="（1）大津法（针对两峰）：0-5六个灰度级别，根据直方图分布，以每个灰度等级分割直方图分布为两个部分，分别求取均值跟方差，如图示，最小方法差和对应的灰度值为，分割阈值"><a href="#（1）大津法（针对两峰）：0-5六个灰度级别，根据直方图分布，以每个灰度等级分割直方图分布为两个部分，分别求取均值跟方差，如图示，最小方法差和对应的灰度值为，分割阈值" class="headerlink" title="（1）大津法（针对两峰）：0~5六个灰度级别，根据直方图分布，以每个灰度等级分割直方图分布为两个部分，分别求取均值跟方差，如图示，最小方法差和对应的灰度值为，分割阈值."></a>(1) <strong>Otsu's method</strong> (for bimodal histograms): given the six gray levels <strong>0~5</strong>, split the histogram into two parts at each gray level in turn and compute the mean and variance of each part. As shown in the figure, the gray level that gives the smallest sum of the two variances is the segmentation threshold.</h4><p><img src="https://s3.bmp.ovh/imgs/2022/12/24/ac02c9856935afa6.png" alt=""></p><h4 id="（2）三角法（针对单峰）"><a href="#（2）三角法（针对单峰）" class="headerlink" title="（2）三角法（针对单峰）"></a>(2) Triangle method (for unimodal histograms)</h4><p><img src="https://s3.bmp.ovh/imgs/2022/12/24/c0c9b34c866781c5.png" alt=""></p><p>Angles α and β are both 45°; the point with the longest distance d, shifted by 0.2, is the threshold point.</p><h4 id="两种方法都是基于直方图分布"><a href="#两种方法都是基于直方图分布" class="headerlink" title="两种方法都是基于直方图分布"></a>Both methods are based on the histogram distribution</h4><h3 id="全局二值化函数"><a href="#全局二值化函数" class="headerlink" title="全局二值化函数"></a>Global binarization function</h3><p><strong>cv.threshold(src, thresh,maxval, type[,dst]) -&gt;retval,dst</strong></p><p>    •<strong>type</strong> selects the binarization method</p><p>        •<strong>THRESH_BINARY | THRESH_OTSU  automatic global threshold + binarization (Otsu)</strong></p><p>        •<strong>THRESH_BINARY | THRESH_TRIANGLE  automatic global threshold + binarization (triangle)</strong></p><p>        •<strong>THRESH_BINARY_INV | THRESH_OTSU</strong></p><p>        <strong>Each combination denotes a different global binarization method</strong></p><h3 id="自适应二值化"><a href="#自适应二值化" class="headerlink" title="自适应二值化"></a>Adaptive binarization</h3><p>•<strong>Blurred image D</strong> (mean blur or Gaussian blur)</p><p>•<strong>Original image</strong> S, plus an <strong>offset constant</strong> C</p><p>•<strong>T = S - D &gt; -C ? 
255 : 0</strong></p><h4 id="自适应二值化函数"><a href="#自适应二值化函数" class="headerlink" title="自适应二值化函数"></a>Adaptive binarization function</h4><p><strong>cv.adaptiveThreshold</strong>(<strong>src</strong>, <strong>maxValue</strong>, <strong>adaptiveMethod</strong>, <strong>thresholdType</strong>, <strong>blockSize</strong>, C[,<strong>dst</strong>] ) -&gt;<strong>dst</strong></p><p>    •cv.ADAPTIVE_THRESH_MEAN_C         mean</p><p>    •cv.ADAPTIVE_THRESH_GAUSSIAN_C   Gaussian</p><p>    •<strong>blockSize</strong> must be odd</p><p>    •<strong>C</strong> is the constant subtracted from the local mean; it may be positive, negative, or zero</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"># 27 Global and adaptive binarization</span><br><span class="line">def binarier_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)</span><br><span class="line">    cv.imshow(&quot;gray&quot;, gray)</span><br><span class="line"></span><br><span class="line">    # automatic global threshold, Otsu's method (the thresh argument 0 is ignored)</span><br><span class="line">    ret, binary = cv.threshold(gray, 0, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)</span><br><span class="line">    cv.imshow(&quot;binary1&quot;, binary)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line"></span><br><span class="line"> 
   # automatic global threshold, triangle method</span><br><span class="line">    ret, binary = cv.threshold(gray, 0, 255, cv.THRESH_BINARY | cv.THRESH_TRIANGLE)</span><br><span class="line">    cv.imshow(&quot;binary2&quot;, binary)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line"></span><br><span class="line">    # adaptive threshold</span><br><span class="line">    binary = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY, 25, 10)</span><br><span class="line">    cv.imshow(&quot;binary3&quot;, binary)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line"></span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="28实时人脸检测"><a href="#28实时人脸检测" class="headerlink" title="28实时人脸检测"></a><strong>28 Real-Time Face Detection</strong></h1><h3 id="OpenCV4-DNN模块"><a href="#OpenCV4-DNN模块" class="headerlink" title="OpenCV4 DNN模块"></a>The <strong>OpenCV4</strong> <strong>DNN</strong> module</h3><p>•Originated from another open-source project, tiny-dnn</p><p>•Officially released in OpenCV 3.3</p><p>•Latest version when these notes were written: OpenCV 4.5.5</p><p>•Supports hardware-accelerated backends (CPU, GPU, etc.)</p><p>•Supports many tasks (classification, detection, segmentation, style transfer, scene text detection, etc.)</p><p>•Supports inference (model deployment) only, not model training</p><p>•Models produced by the mainstream deep learning frameworks can be loaded by OpenCV</p><p>•PyTorch or TensorFlow is recommended</p><h3 id="OpenCV人脸检测支持演化"><a href="#OpenCV人脸检测支持演化" class="headerlink" title="OpenCV人脸检测支持演化"></a>Evolution of face detection support in OpenCV</h3><p>•Before OpenCV 3.3: cascade detection based on HAAR/LBP features</p><p>•From OpenCV 3.3: deep learning face detection is supported</p><p>•Caffe and TensorFlow face detection models are supported</p><p>•OpenCV 4.5.4: face detection plus landmarks</p><p>•Model download address:</p><p>•<a href="https://gitee.com/opencv_ai/opencv_tutorial_data">https://gitee.com/opencv_ai/opencv_tutorial_data</a></p><h3 id="DNN相关函数"><a href="#DNN相关函数" class="headerlink" title="DNN相关函数"></a>DNN functions</h3><p>•Load the model: readNetFromTensorflow</p><p>•Convert the image to a blob: blobFromImage</p><p>•Set the input: setInput</p><p>•Run inference: forward</p><h3 id="人脸检测显示"><a href="#人脸检测显示" class="headerlink" title="人脸检测显示"></a>Displaying face detections</h3><p>•Model input: 1x3x300x300</p><p>•Model output: 1x1xNx7 (N detected faces, 7 values each)</p><p>    Face bounding box (left, top, right, bottom): the last four values</p><p>    Prediction confidence (score): the third value</p><p>    class_id (class): the second value</p><figure class="highlight 
plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span 
class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br></pre></td><td class="code"><pre><span class="line"># Face detection</span><br><span class="line"># Download the two model files below into the project directory (https://gitee.com/opencv_ai/opencv_tutorial_data)</span><br><span class="line">model_bin = &quot;opencv_face_detector_uint8.pb&quot;</span><br><span class="line">config_text = &quot;opencv_face_detector.pbtxt&quot;</span><br><span class="line"></span><br><span class="line"># Face detection on video</span><br><span class="line">def video_detection():</span><br><span class="line">    font = cv.FONT_HERSHEY_SIMPLEX</span><br><span class="line">    font_scale = 
0.5</span><br><span class="line">    thickness = 1</span><br><span class="line"></span><br><span class="line">    # load the TensorFlow model</span><br><span class="line">    net = cv.dnn.readNetFromTensorflow(model_bin, config=config_text)</span><br><span class="line">    capture = cv.VideoCapture(0)  # open the default camera</span><br><span class="line">    # Face detection loop</span><br><span class="line">    while True:</span><br><span class="line">        e1 = cv.getTickCount()</span><br><span class="line">        ret, frame = capture.read()</span><br><span class="line">        if ret is not True:</span><br><span class="line">            break</span><br><span class="line">        frame = cv.flip(frame, 1)  # mirror only after a successful read</span><br><span class="line">        h, w, c = frame.shape</span><br><span class="line">        blobImage = cv.dnn.blobFromImage(frame, 1.0, (300, 300), (104.0, 177.0, 123.0), False, False)</span><br><span class="line">        net.setInput(blobImage)</span><br><span class="line">        cvOut = net.forward()</span><br><span class="line">        print(cvOut.shape)</span><br><span class="line"></span><br><span class="line">        # Put efficiency information</span><br><span class="line">        t, _ = net.getPerfProfile()</span><br><span class="line">        label = &#x27;Inference time: %.2f ms&#x27; % (t * 1000.0 / cv.getTickFrequency())</span><br><span class="line"></span><br><span class="line">        # Draw the detection boxes</span><br><span class="line">        for detection in cvOut[0, 0, :, :]:</span><br><span class="line">            score = float(detection[2])</span><br><span class="line">            objIndex = int(detection[1])</span><br><span class="line">            if score &gt; 0.5:</span><br><span class="line">                left = detection[3] * w</span><br><span class="line">                top = detection[4] * h</span><br><span class="line">                right = detection[5] * w</span><br><span class="line">                bottom = detection[6] * h</span><br><span 
class="line"></span><br><span class="line">                # Draw the bounding box</span><br><span class="line">                cv.rectangle(frame, (int(left), int(top)), (int(right), int(bottom)), (255, 0, 0), thickness=2)</span><br><span class="line"></span><br><span class="line">                # Draw the score label</span><br><span class="line">                label_txt = &quot;score: %.2f&quot; % score</span><br><span class="line">                (fw, uph), dh = cv.getTextSize(label_txt, font, font_scale, thickness)</span><br><span class="line">                cv.rectangle(frame, (int(left), int(top) - uph - dh), (int(left) + fw, int(top)), (255, 255, 255), -1, 8)</span><br><span class="line">                cv.putText(frame, label_txt, (int(left), int(top) - dh), font, font_scale, (255, 0, 255), thickness)</span><br><span class="line"></span><br><span class="line">        e2 = cv.getTickCount()</span><br><span class="line">        fps = cv.getTickFrequency() / (e2 - e1)</span><br><span class="line">        cv.putText(frame, label + (&quot; FPS: %.2f&quot; % fps), (10, 50), cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)</span><br><span class="line">        cv.imshow(&#x27;face-detection-demo&#x27;, frame)</span><br><span class="line">        c = cv.waitKey(1)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">    capture.release()</span><br><span class="line">    cv.destroyAllWindows()</span><br><span class="line"></span><br><span class="line"># Face detection on a single image</span><br><span class="line">def image_detection():</span><br><span class="line">    font = cv.FONT_HERSHEY_SIMPLEX</span><br><span class="line">    font_scale = 0.5</span><br><span class="line">    thickness = 1</span><br><span class="line"></span><br><span class="line">    # load the TensorFlow model</span><br><span class="line">    net = cv.dnn.readNetFromTensorflow(model_bin, config=config_text)</span><br><span class="line">    # Face detection</span><br><span class="line"> 
   e1 = cv.getTickCount()</span><br><span class="line">    frame = cv.imread(&quot;face.png&quot;)</span><br><span class="line">    h, w, c = frame.shape</span><br><span class="line">    blobImage = cv.dnn.blobFromImage(frame, 1.0, (300, 300), (104.0, 177.0, 123.0), False, False)</span><br><span class="line">    net.setInput(blobImage)</span><br><span class="line">    cvOut = net.forward()</span><br><span class="line">    print(cvOut.shape)</span><br><span class="line"></span><br><span class="line">    # Put efficiency information</span><br><span class="line">    t, _ = net.getPerfProfile()</span><br><span class="line">    label = &#x27;Inference time: %.2f ms&#x27; % (t * 1000.0 / cv.getTickFrequency())</span><br><span class="line"></span><br><span class="line">    # Draw the detection boxes</span><br><span class="line">    for detection in cvOut[0, 0, :, :]:</span><br><span class="line">        score = float(detection[2])</span><br><span class="line">        objIndex = int(detection[1])</span><br><span class="line">        if score &gt; 0.5:</span><br><span class="line">            left = detection[3] * w</span><br><span class="line">            top = detection[4] * h</span><br><span class="line">            right = detection[5] * w</span><br><span class="line">            bottom = detection[6] * h</span><br><span class="line"></span><br><span class="line">            # Draw the bounding box</span><br><span class="line">            cv.rectangle(frame, (int(left), int(top)), (int(right), int(bottom)), (255, 0, 0), thickness=2)</span><br><span class="line"></span><br><span class="line">            # Draw the score label</span><br><span class="line">            label_txt = &quot;score: %.2f&quot; % score</span><br><span class="line">            (fw, uph), dh = cv.getTextSize(label_txt, font, font_scale, thickness)</span><br><span class="line">            cv.rectangle(frame, (int(left), int(top) - uph - dh), (int(left) + fw, int(top)), (255, 255, 255), -1, 8)</span><br><span class="line">            
cv.putText(frame, label_txt, (int(left), int(top) - dh), font, font_scale, (255, 0, 255), thickness)</span><br><span class="line"></span><br><span class="line">    e2 = cv.getTickCount()</span><br><span class="line">    fps = cv.getTickFrequency() / (e2 - e1)</span><br><span class="line">    cv.putText(frame, label + (&quot; FPS: %.2f&quot; % fps), (10, 50), cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)</span><br><span class="line">    cv.imshow(&#x27;face-detection-demo&#x27;, frame)</span><br><span class="line">    c = cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;25像素重映射&quot;&gt;&lt;a href=&quot;#25像素重映射&quot; class=&quot;headerlink&quot; title=&quot;25像素重映射&quot;&gt;&lt;/a&gt;25 &lt;strong&gt;Pixel Remapping&lt;/strong&gt;&lt;/h1&gt;&lt;h4 id=&quot;把像素点P-x-y-重新映射到一个新的位置P’-x’-y</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——6</title>
    <link href="https://yang-makabaka.github.io/posts/c2e52250.html"/>
    <id>https://yang-makabaka.github.io/posts/c2e52250.html</id>
    <published>2022-12-22T08:24:59.000Z</published>
    <updated>2022-12-24T13:39:40.592Z</updated>
    
    <content type="html"><![CDATA[<h1 id="21图像直方图"><a href="#21图像直方图" class="headerlink" title="21图像直方图"></a>21 <strong>Image Histogram</strong></h1><h3 id="图像直方图函数"><a href="#图像直方图函数" class="headerlink" title="图像直方图函数"></a>Image histogram function</h3><p>•<strong>cv.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]]) -&gt; hist</strong></p><p>•<strong>images</strong>: list of input images</p><p>•<strong>channels</strong>: list of channel indices to compute the histogram for</p><p>•<strong>mask</strong>: optional mask; pass None to use the whole image</p><p>•<strong>histSize</strong>: number of <strong>bins</strong> (gray levels)</p><p>•<strong>ranges</strong>: value range of the channel</p><p>The function returns a histogram of type np.float32.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"># 21 Image histogram</span><br><span class="line">def image_hist():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    cv.imshow(&quot;input&quot;, image)</span><br><span class="line">    colors = (&#x27;blue&#x27;, &#x27;green&#x27;, &#x27;red&#x27;)  # OpenCV channel order is BGR</span><br><span class="line">    for i, color in enumerate(colors):</span><br><span class="line">        hist = cv.calcHist([image], [i], None, [32], [0, 256])</span><br><span class="line">        print(hist.dtype)</span><br><span class="line">        plt.plot(hist, color=color)</span><br><span class="line">        plt.xlim([0, 32])</span><br><span class="line">    plt.show()</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="22直方图均衡化"><a href="#22直方图均衡化" class="headerlink" 
title="22直方图均衡化"></a>22 <strong>Histogram Equalization</strong></h1><h3 id="均衡化作用"><a href="#均衡化作用" class="headerlink" title="均衡化作用"></a><strong>What equalization does</strong></h3><p>•Improves contrast</p><p>•Supported for grayscale images</p><p>Equalization remaps gray levels through the cumulative histogram so that intensities spread over the full range, which widens the gaps between heavily populated gray levels.</p><h3 id="直方图均衡化函数"><a href="#直方图均衡化函数" class="headerlink" title="直方图均衡化函数"></a>Histogram equalization function</h3><p>•<strong>cv.equalizeHist(src[, dst]) -&gt; dst</strong></p><p>•<strong>src</strong> must be an 8-bit single-channel image</p><p>•<strong>dst</strong> is the result image, with the same type as <strong>src</strong></p><h4 id="彩色直方图均衡化可以先转换到HSV空间然后对V通道均衡化（只对亮度通道增强）"><a href="#彩色直方图均衡化可以先转换到HSV空间然后对V通道均衡化（只对亮度通道增强）" class="headerlink" title="彩色直方图均衡化可以先转换到HSV空间然后对V通道均衡化（只对亮度通道增强）"></a>For color images, convert to HSV first and equalize only the V channel (this enhances brightness only)</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"># 22 Histogram equalization</span><br><span class="line">def hist_equ():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;,cv.IMREAD_GRAYSCALE)</span><br><span class="line">    cv.imshow(&quot;input&quot;,image)</span><br><span class="line">    hist = cv.calcHist([image], [0], None, [32], [0,256])</span><br><span class="line">    print(hist.dtype)</span><br><span class="line">    plt.plot(hist,color = &quot;gray&quot;)</span><br><span class="line">    plt.xlim([0,32])</span><br><span class="line">    plt.show()</span><br><span class="line">    
cv.waitKey(0)</span><br><span class="line"></span><br><span class="line">    eqimg = cv.equalizeHist(image)</span><br><span class="line">    hist = cv.calcHist([eqimg], [0], None, [32], [0,256])</span><br><span class="line">    print(hist.dtype)</span><br><span class="line">    plt.plot(hist, color = &quot;gray&quot;)</span><br><span class="line">    plt.xlim([0,32])</span><br><span class="line">    plt.show()</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="23图像卷积操作"><a href="#23图像卷积操作" class="headerlink" title="23图像卷积操作"></a>23 <strong>Image Convolution</strong></h1><p>At its core, convolution is a linear combination: each output pixel is a weighted sum of the pixels in its neighborhood.</p><h3 id="图像卷积定义"><a href="#图像卷积定义" class="headerlink" title="图像卷积定义"></a>Definition of image convolution</h3><p><img src="https://s3.bmp.ovh/imgs/2022/12/22/ed0f74cd8d8871cd.png" alt=""></p><h3 id="卷积的边缘填充"><a href="#卷积的边缘填充" class="headerlink" title="卷积的边缘填充"></a>Border padding for convolution</h3><p>•Border handling: ways to pad the image border</p><p>•(1) cv.BORDER_DEFAULT: gfedcb|abcdefgh|gfedcba</p><p>•(2) cv.BORDER_WRAP: cdefgh|abcdefgh|abcdefg</p><p>•(3) cv.BORDER_CONSTANT: iiiiii|abcdefgh|iiiiiii</p><h3 id="卷积模糊函数"><a href="#卷积模糊函数" class="headerlink" title="卷积模糊函数"></a>Convolution blur function</h3><p>•<strong>cv.blur(src, ksize[, dst[, anchor[, borderType]]]) -&gt; dst</strong></p><p>•<strong>src</strong>: input image, CV_8U, CV_32F or CV_64F</p><p>•<strong>ksize</strong>: kernel size</p><p>•<strong>anchor</strong>: anchor position (the pixel being smoothed); the default is (-1,-1), and a negative coordinate means the kernel center is used as the anchor</p><p>•<strong>borderType</strong>: border handling mode</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"># 23 Image convolution</span><br><span class="line">def conv_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    dst = np.copy(image)</span><br><span class="line">    cv.imshow(&quot;input&quot;,image)</span><br><span class="line">    h, w, c = image.shape</span><br><span class="line">    for row in range(2, h-2, 1):  # skip the 2-pixel border so the 5x5 window stays inside the image</span><br><span class="line">        for col in range(2, w-2, 1):</span><br><span class="line">            m = cv.mean(image[row-2:row+3, col-2:col+3])  # mean of the 5x5 neighborhood</span><br><span class="line">            dst[row, col] = (int(m[0]), int(m[1]), int(m[2]))</span><br><span class="line">    cv.imshow(&quot;convolution-demo&quot;, dst)</span><br><span class="line"></span><br><span class="line">    # blured = cv.blur(image, (5,5), anchor=(-1, -1)) # change ksize to adjust the amount of blur</span><br><span class="line">    # cv.imshow(&quot;blur-demo&quot;, blured)</span><br><span class="line"></span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="24高斯模糊"><a href="#24高斯模糊" class="headerlink" title="24高斯模糊"></a>24 <strong>Gaussian Blur</strong></h1><h4 id="用高斯公式产生高斯卷积核"><a href="#用高斯公式产生高斯卷积核" class="headerlink" title="用高斯公式产生高斯卷积核"></a>The Gaussian kernel is generated from the Gaussian formula</h4><h4 id="卷积核根据高斯函数生成，权重系数不同"><a href="#卷积核根据高斯函数生成，权重系数不同" class="headerlink" title="卷积核根据高斯函数生成，权重系数不同"></a>Because the kernel follows the Gaussian function, its weights differ by position</h4><p><img src="https://s3.bmp.ovh/imgs/2022/12/22/65c2e59e9e444c64.png" alt=""></p><h3 id="高斯函数"><a href="#高斯函数" class="headerlink" title="高斯函数"></a>Gaussian blur function</h3><p><strong>cv.GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType]]]) -&gt;dst</strong></p><p>
•ksize must be positive and odd (so the kernel has a center), or (0, 0)</p><p>•sigmaX: standard deviation of the Gaussian kernel in the X direction</p><p>•sigmaY: standard deviation of the Gaussian kernel in the Y direction; the default 0 means it equals sigmaX</p><p>•ksize == (0, 0): ksize is computed from sigmaX</p><p>•ksize &gt; 0 with sigmaX == 0: sigma is computed from ksize; if a positive sigmaX is also given, both the given ksize and sigmaX are used</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/22/ceeff837bc6b4300.png" alt=""></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"># 24 Gaussian blur</span><br><span class="line">def gaussian_blur_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    cv.imshow(&quot;input&quot;,image)</span><br><span class="line">    g1 = cv.GaussianBlur(image, (0, 0), 15)  # kernel size derived from sigmaX</span><br><span class="line">    g2 = cv.GaussianBlur(image, (15, 15), 15)  # explicit 15x15 kernel with sigmaX = 15</span><br><span class="line">    cv.imshow(&quot;GaussianBlur-demo1&quot;, g1)</span><br><span class="line">    cv.imshow(&quot;GaussianBlur-demo2&quot;, g2)</span><br><span class="line"></span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;21图像直方图&quot;&gt;&lt;a href=&quot;#21图像直方图&quot; class=&quot;headerlink&quot; title=&quot;21图像直方图&quot;&gt;&lt;/a&gt;21 &lt;strong&gt;Image Histogram&lt;/strong&gt;&lt;/h1&gt;&lt;h3 id=&quot;图像直方图函数&quot;&gt;&lt;a href=&quot;#图像直方图函数&quot; </summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——5</title>
    <link href="https://yang-makabaka.github.io/posts/5bec73ea.html"/>
    <id>https://yang-makabaka.github.io/posts/5bec73ea.html</id>
    <published>2022-12-20T11:52:49.000Z</published>
    <updated>2022-12-24T13:39:51.737Z</updated>
    
    <content type="html"><![CDATA[<h1 id="17鼠标响应与操作"><a href="#17鼠标响应与操作" class="headerlink" title="17鼠标响应与操作"></a>17 Mouse Events and Interaction</h1><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/0aa42ce71379b7b7.png" alt=""></p><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/993a2335eb6f2248.png" alt=""></p><h4 id="•回调函数参数-int-event-int-x-int-y-int-flags-void-userdata"><a href="#•回调函数参数-int-event-int-x-int-y-int-flags-void-userdata" class="headerlink" title="•回调函数参数: int event, int x, int y, int flags, void **userdata*"></a>•Callback parameters: <em>int event, int x, int y, int flags, void *userdata</em></h4><p>•<strong>event</strong>: the mouse event type</p><p>•<strong>(x, y)</strong>: the current mouse position</p><p>•<strong>flags</strong>: the mouse button and modifier state</p><p>•<strong>userdata</strong>: user data passed to the callback; may be empty</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span 
class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line">b1 = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">img = np.copy(b1)</span><br><span class="line">x1 = -1</span><br><span class="line">y1 = -1</span><br><span class="line">x2 = -1</span><br><span class="line">y2 = -1</span><br><span class="line">def mouse_drawing(event, x, y, flags, param):</span><br><span class="line">    global x1, y1, x2, y2</span><br><span class="line">    if event == cv.EVENT_LBUTTONDOWN:</span><br><span class="line">        x1 = x</span><br><span class="line">        y1 = y</span><br><span class="line">    if event == cv.EVENT_MOUSEMOVE:</span><br><span class="line">        if x1 &lt;0 or y1 &lt;0:</span><br><span class="line">            return</span><br><span class="line">        x2 = x</span><br><span class="line">        y2 = y</span><br><span class="line">        dx = x2 - x1</span><br><span class="line">        dy = y2 - y1</span><br><span class="line">        if dx &gt; 0 and dy &gt; 0:</span><br><span class="line">            b1[:,:,:] = img[:,:,:]</span><br><span class="line">            cv.rectangle(b1, (x1, y1), (x2, y2), (0, 0, 255), 2, 8, 0)</span><br><span class="line">    if event == cv.EVENT_LBUTTONUP:</span><br><span class="line">        x2 = x</span><br><span class="line">        y2 = y</span><br><span class="line">        dx = x2 - x1</span><br><span class="line">        dy = y2 - y1</span><br><span class="line">        if dx &gt; 0 and dy &gt; 0:</span><br><span class="line">            b1[:,:,:] = img[:,:,:]</span><br><span class="line">            cv.rectangle(b1, (x1, y1), (x2, y2), (0, 0, 255), 2, 8, 0)</span><br><span class="line">        x1 = -1</span><br><span class="line">  
      x2 = -1</span><br><span class="line">        y1 = -1</span><br><span class="line">        y2 = -1</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">def mouse_demo():</span><br><span class="line">    cv.namedWindow(&quot;mouse_demo&quot;,cv.WINDOW_AUTOSIZE)</span><br><span class="line">    cv.setMouseCallback(&quot;mouse_demo&quot;,mouse_drawing)</span><br><span class="line">    while True:</span><br><span class="line">        cv.imshow(&quot;mouse_demo&quot;,b1)</span><br><span class="line">        c = cv.waitKey(10)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="18图像像素类型转换与归一化"><a href="#18图像像素类型转换与归一化" class="headerlink" title="18图像像素类型转换与归一化"></a>18 <strong>Pixel Type Conversion and Normalization</strong></h1><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/e02e0be369644a53.png" alt=""></p><h4 id="归一化函数"><a href="#归一化函数" class="headerlink" title="归一化函数"></a>Normalization function</h4><p>•cv.normalize( src, dst[, alpha[, beta[, norm_type[, dtype[, mask]]]]] ) -&gt; dst</p><p>•src is the input image, dst is the output image</p><p>•alpha and beta default to 1 and 0 and define the normalization range</p><p>•norm_type defaults to NORM_L2</p><p>•NORM_MINMAX is the most commonly used norm_type</p><p>imread returns uint8 by default. After converting to float32, the image must be normalized to the range [0, 1] before it is displayed with imshow.</p><p>To convert a normalized float32 image back to uint8: np.uint8(image*255)</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"># Normalization</span><br><span class="line">def trackbar_callback(pos):</span><br><span class="line">    print(pos)</span><br><span class="line"></span><br><span class="line">def norm_demo():</span><br><span class="line">    image_uint8 = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    cv.imshow(&quot;image_uint8&quot;,image_uint8)</span><br><span class="line">    img_f32 = np.float32(image_uint8)</span><br><span class="line">    # cv.imshow(&quot;img_f32&quot;,img_f32)</span><br><span class="line">    # cv.normalize(img_f32, img_f32, 1, 0, cv.NORM_MINMAX)</span><br><span class="line">    # cv.imshow(&quot;norm-img_f32&quot;,img_f32)</span><br><span class="line">    # cv.waitKey(0)</span><br><span class="line">    # cv.destroyAllWindows()</span><br><span class="line"></span><br><span class="line">    cv.namedWindow(&quot;norm-demo&quot;,cv.WINDOW_AUTOSIZE)</span><br><span class="line">    cv.createTrackbar(&quot;normtype&quot;, &quot;norm-demo&quot;, 0, 3, trackbar_callback)  # same names as getTrackbarPos below; initial value within 0..3</span><br><span class="line">    while True:</span><br><span class="line">        dst = np.float32(image_uint8)</span><br><span class="line">        pos = cv.getTrackbarPos(&quot;normtype&quot;,&quot;norm-demo&quot;)</span><br><span class="line">        if pos == 0:</span><br><span class="line">            cv.normalize(dst, dst, 1, 0, cv.NORM_MINMAX)</span><br><span class="line">        if pos == 1:</span><br><span class="line">            cv.normalize(dst, dst, 1, 0, 
cv.NORM_L1)</span><br><span class="line">        if pos == 2:</span><br><span class="line">            cv.normalize(dst, dst, 1, 0, cv.NORM_L2)</span><br><span class="line">        if pos == 3:</span><br><span class="line">            cv.normalize(dst, dst, 1, 0, cv.NORM_INF)</span><br><span class="line">        cv.imshow(&quot;norm-demo&quot;,dst)  # show the normalized result, not the raw float image</span><br><span class="line">        c = cv.waitKey(50)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="19图像几何变换"><a href="#19图像几何变换" class="headerlink" title="19图像几何变换"></a>19 <strong>Geometric Image Transformations</strong></h1><p>•cv.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]] ) -&gt; dst</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/d506a88ea5e4090d.png" alt=""></p><p>•src is the input image</p><p>•M is the 2x3 transformation matrix</p><p>•dsize is the size of the output image dst</p><p>•Supports translation, scaling and rotation</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/640c93c7d43f08e6.png" alt=""></p><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/cb7ad0f5308870fa.png" alt=""></p><p><img src="https://s3.bmp.ovh/imgs/2022/12/20/a9105cbaf0538d38.png" alt=""></p><h3 id="获取旋转矩阵"><a href="#获取旋转矩阵" class="headerlink" title="获取旋转矩阵"></a>Getting the rotation matrix</h3><p>•Use cv.getRotationMatrix2D to obtain the rotation matrix</p><p>•center is the rotation center; angle is in degrees, and positive values rotate counterclockwise; scale is the scaling factor.</p><h3 id="翻转与特殊角度旋转"><a href="#翻转与特殊角度旋转" class="headerlink" title="翻转与特殊角度旋转"></a>Flipping and special-angle rotation</h3><p>•cv.flip(src, flipCode[, dst] ) -&gt;dst</p><p>•cv.rotate(src, rotateCode[, dst] ) -&gt; dst</p><p>•src is the input image</p><p>•flipCode: 0 flips around the x-axis (top to bottom), a positive value such as 1 flips around the y-axis (left to right), and a negative value such as -1 flips around both axes</p><p>•rotateCode supports rotation by 90°, 180° and 270°</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"># Geometric image transformations</span><br><span class="line">def affine_demo():</span><br><span class="line">    image = cv.imread(&quot;123.jpg&quot;)</span><br><span class="line">    h, w, c = image.shape</span><br><span class="line">    cx = int(w / 2)</span><br><span class="line">    cy = int(h / 2)</span><br><span class="line">    cv.imshow(&quot;image&quot;,image)</span><br><span class="line"></span><br><span class="line">    M = np.zeros((2,3), dtype=np.float32)</span><br><span class="line">    M[0, 0] = .7</span><br><span class="line">    M[1, 1] = .7</span><br><span class="line">    M[0, 2] = 0</span><br><span class="line">    M[1, 2] = 0</span><br><span class="line">    print(&quot;M(2x3) = \n&quot;, M)</span><br><span class="line">    dst = cv.warpAffine(image, M, (int(w*.7), int(h*.7)))</span><br><span class="line">    cv.imshow(&quot;rescale-demo&quot;,dst)</span><br><span class="line">    cv.imwrite(&quot;result.png&quot;,dst)</span><br><span class="line"></span><br><span class="line">    # Get the rotation matrix; angle &gt; 0 rotates counterclockwise, with the origin at the top-left corner</span><br><span class="line">    M = cv.getRotationMatrix2D((w/2, h/2), 45.0, 1.0)</span><br><span class="line">    dst = cv.warpAffine(image, M, (w,h))</span><br><span class="line">    cv.imshow(&quot;rotate-demo&quot;,dst)</span><br><span class="line"></span><br><span 
class="line">    dst = cv.flip(image, 0)</span><br><span class="line">    cv.imshow(&quot;flip-demo&quot;,dst)</span><br><span class="line"></span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="20视频读写处理"><a href="#20视频读写处理" class="headerlink" title="20视频读写处理"></a>20 <strong>Reading and Writing Video</strong></h1><h3 id="视频标准与格式"><a href="#视频标准与格式" class="headerlink" title="视频标准与格式"></a>Video standards and formats</h3><p>•SD (Standard Definition): 480P</p><p>•HD (High Definition): 720P/1080P</p><p>•UHD (Ultra High Definition): 4K/2160P</p><p>•Resolutions</p><p>•SD-640x480, 704x480, 720x480, 848x480, etc.</p><p>•HD-960x720,1280x720,1440x1080,1920x1080</p><p>•UHD-4K,2160P</p><h3 id="视频读取函数"><a href="#视频读取函数" class="headerlink" title="视频读取函数"></a>Video reading function</h3><p><code>cv.VideoCapture ( filename, index, apiPreference)</code></p><p>•filename is the path of a video file; pass either filename or index, not both</p><p>•index is the index of a USB camera or web camera</p><p>•apiPreference = CAP_ANY lets OpenCV choose the video backend automatically, for example cv.CAP_FFMPEG or cv.CAP_DSHOW</p><h3 id="查询视频属性"><a href="#查询视频属性" class="headerlink" title="查询视频属性"></a>Querying video properties</h3><p>•Use the get method of VideoCapture</p><p>•cv.CAP_PROP_FRAME_WIDTH</p><p>•cv.CAP_PROP_FRAME_HEIGHT</p><p>•cv.CAP_PROP_FPS (0 for video streams)</p><p>•cv.CAP_PROP_FOURCC</p><p>•cv.CAP_PROP_FRAME_COUNT</p><h3 id="视频文件保存"><a href="#视频文件保存" class="headerlink" title="视频文件保存"></a>Saving video files</h3><p>cv.VideoWriter( </p><p>filename, output file name</p><p>fourcc, codec</p><p>fps, frame rate</p><p>frameSize, video frame size; it <strong>must match the actual frame size</strong>, otherwise the file cannot be saved</p><p>[, isColor] )</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">#20视频读写处理</span><br><span class="line">def video_demo():</span><br><span class="line">    cap = cv.VideoCapture(&quot;456.avi&quot;)</span><br><span class="line">    # query video file metadata</span><br><span class="line">    fps = cap.get(cv.CAP_PROP_FPS)</span><br><span class="line">    frame_w = cap.get(cv.CAP_PROP_FRAME_WIDTH)</span><br><span class="line">    frame_h = cap.get(cv.CAP_PROP_FRAME_HEIGHT)</span><br><span class="line">    print(fps, frame_w, frame_h)</span><br><span class="line">    # encode mode</span><br><span class="line">    # fourcc =cv.VideoWriter_fourcc(*&#x27;vp09&#x27;)</span><br><span class="line">    fourcc = cap.get(cv.CAP_PROP_FOURCC)</span><br><span class="line">    # create Video writer</span><br><span class="line">    writer = cv.VideoWriter(&#x27;output.mp4&#x27;, int(fourcc), fps, (int(frame_w), int(frame_h)))</span><br><span class="line">    # loop read frame until last frame</span><br><span class="line">    while True:</span><br><span class="line">        ret, frame = cap.read()</span><br><span class="line">        if ret is not True:</span><br><span class="line">            break</span><br><span class="line">        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)</span><br><span class="line">        cv.imshow(&quot;hsv&quot;,hsv)</span><br><span class="line">        cv.imshow(&quot;frame&quot;,frame)</span><br><span class="line">        c = 
cv.waitKey(1)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">        writer.write(frame)</span><br><span class="line"></span><br><span class="line">    # release camera resource</span><br><span class="line">    cap.release()</span><br><span class="line">    writer.release()</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;17鼠标响应与操作&quot;&gt;&lt;a href=&quot;#17鼠标响应与操作&quot; class=&quot;headerlink&quot; title=&quot;17鼠标响应与操作&quot;&gt;&lt;/a&gt;17鼠标响应与操作&lt;/h1&gt;&lt;p&gt;&lt;img src=&quot;https://s3.bmp.ovh/imgs/2022/12/</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——4</title>
    <link href="https://yang-makabaka.github.io/posts/2ceb437c.html"/>
    <id>https://yang-makabaka.github.io/posts/2ceb437c.html</id>
    <published>2022-12-18T13:19:22.000Z</published>
    <updated>2022-12-24T13:40:02.274Z</updated>
    
    <content type="html"><![CDATA[<h1 id="13图像统计信息"><a href="#13图像统计信息" class="headerlink" title="13图像统计信息"></a>13 Image Statistics</h1><h4 id="像素值统计-均值"><a href="#像素值统计-均值" class="headerlink" title="像素值统计-均值"></a>Pixel statistics: mean</h4><p><code>•cv.mean(src[, mask]) -&gt; retval</code></p><h4 id="像素值统计-方差"><a href="#像素值统计-方差" class="headerlink" title="像素值统计-方差"></a>Pixel statistics: mean and standard deviation</h4><p><code>•cv.meanStdDev(src[, mean[, stddev[, mask]]]) -&gt; mean, stddev</code></p><h4 id="像素值统计-极值"><a href="#像素值统计-极值" class="headerlink" title="像素值统计-极值"></a>Pixel statistics: extrema</h4><p><code>•cv.minMaxLoc(src[, mask]) -&gt; minVal, maxVal, minLoc, maxLoc</code></p><h5 id="•src表示输入图像-mask表示计算区域"><a href="#•src表示输入图像-mask表示计算区域" class="headerlink" title="•src表示输入图像,mask表示计算区域"></a>•src is the input image; mask restricts the region used in the computation</h5><h5 id="•mean-stddev-minVal-maxVal分别表示均值，标准方差，最小与最大"><a href="#•mean-stddev-minVal-maxVal分别表示均值，标准方差，最小与最大" class="headerlink" title="•mean, stddev, minVal, maxVal分别表示均值，标准方差，最小与最大"></a>•mean, stddev, minVal and maxVal are the mean, standard deviation, minimum and maximum</h5><h5 id="•cv2-convertScaleAbs-函数通过线性变换将数据转为均值，然后转换成8位-uint8"><a href="#•cv2-convertScaleAbs-函数通过线性变换将数据转为均值，然后转换成8位-uint8" class="headerlink" title="•cv2.convertScaleAbs()函数通过线性变换将数据转为均值，然后转换成8位[uint8]"></a>•cv2.convertScaleAbs() applies a linear transform (alpha * src + beta), takes the absolute value, and converts the result to 8-bit (uint8) with saturation</h5><h4 id="每个通道分别计算均值和方差"><a href="#每个通道分别计算均值和方差" class="headerlink" title="每个通道分别计算均值和方差"></a>The mean and standard deviation are computed separately for each channel</h4><h4 id="通过图像方差判断是否含有有效信息"><a href="#通过图像方差判断是否含有有效信息" class="headerlink" title="通过图像方差判断是否含有有效信息"></a>The variance of an image indicates whether it carries useful information</h4><h4 id="调整图像对比度的本质是调整图像之间的差值"><a href="#调整图像对比度的本质是调整图像之间的差值" class="headerlink" title="调整图像对比度的本质是调整图像之间的差值"></a>Adjusting contrast essentially means scaling the differences between pixel values</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">import cv2</span><br><span class="line">import numpy as np</span><br><span class="line"></span><br><span class="line">image = cv2.imread(&quot;123.jpg&quot;)</span><br><span class="line">cv2.imshow(&quot;demo1&quot;,image)</span><br><span class="line">bgr_m = cv2.mean(image)</span><br><span class="line">sub_m = np.float32(image)[:, :] - (bgr_m[0],bgr_m[1],bgr_m[2])</span><br><span class="line">result = sub_m * 0.5</span><br><span class="line">result = result[:, :] + (bgr_m[0],bgr_m[1],bgr_m[2])</span><br><span class="line">cv2.imshow(&quot;low contrast&quot;,cv2.convertScaleAbs(result))</span><br><span class="line"></span><br><span class="line"># result = sub_m *  2.0</span><br><span class="line"># result = result[:, :] + (bgr_m[0],bgr_m[1],bgr_m[2])</span><br><span class="line"># cv2.imshow(&quot;high contrast&quot;,cv2.convertScaleAbs(result))</span><br><span class="line"></span><br><span class="line">cv2.waitKey(0)</span><br><span class="line">cv2.destroyAllWindows()</span><br></pre></td></tr></table></figure><h1 id="14图像几何形状绘制"><a href="#14图像几何形状绘制" class="headerlink" title="14图像几何形状绘制"></a>14 <strong>Drawing Geometric Shapes</strong></h1><p>•Lines, rectangles and circles can be drawn</p><p>•Rectangles, circles and ellipses can be filled</p><p>•Text can be drawn (Chinese characters are not supported)</p><p>•Related functions: <em>cv.line(), cv.circle(), cv.rectangle(), cv.ellipse(), cv.putText()</em></p><p>•Parameters:</p><p><strong>•img</strong> is the image to draw on</p><p><strong>•color</strong> is the color, e.g. (255, 0, 0) is blue in BGR order (it must match the number of channels of img)</p><p><strong>•thickness</strong> is the line width: a value greater than 0 draws the outline, a value less than 0 fills the shape</p><p><strong>•lineType</strong> is the rendering mode: the default LINE_8 draws an 8-connected line (suitable when performance is limited), LINE_AA is anti-aliased (higher quality)</p><h4 id="文本绘制"><a href="#文本绘制" class="headerlink" 
title="文本绘制"></a>Drawing text</h4><p>•<strong>putText</strong> only supports ASCII (English) text by default</p><p><strong>•org</strong> is the starting point of the text (its bottom-left corner)</p><p><strong>•fontFace</strong> is the font type</p><p><strong>•fontScale</strong> is the font scale factor</p><h4 id="计算文本区域大小"><a href="#计算文本区域大小" class="headerlink" title="计算文本区域大小"></a>Computing the text box size</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"># Computes the size of the box occupied by a text string</span><br><span class="line">getTextSize(</span><br><span class="line"></span><br><span class="line">text, # the text string</span><br><span class="line"></span><br><span class="line">fontFace, # the font type</span><br><span class="line"></span><br><span class="line">fontScale, # the font scale factor</span><br><span class="line"></span><br><span class="line">thickness # the line width</span><br><span class="line">) </span><br><span class="line"># Returns the text box size and the baseline offset of the font</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">def 
paint():</span><br><span class="line">    canvas = np.zeros((512,512,3),dtype=np.uint8)</span><br><span class="line"></span><br><span class="line">    # Place the label so that it fits its text box</span><br><span class="line">    font_color = (140,199,0) # box color</span><br><span class="line">    cv.rectangle(canvas,(100,100),(300,300),font_color,2,8) # box</span><br><span class="line"></span><br><span class="line">    label_txt = &quot;OpenCV&quot;</span><br><span class="line">    font = cv.FONT_HERSHEY_SIMPLEX  # font type</span><br><span class="line">    font_scale = 0.5  # font scale</span><br><span class="line">    thickness = 1     # line width</span><br><span class="line">    (fw, uph),dh = cv.getTextSize(label_txt,font,font_scale,thickness)</span><br><span class="line">    cv.rectangle(canvas,(100,100-uph-dh),(100+fw,100),(255,255,255),-1,8)</span><br><span class="line">    cv.putText(canvas,label_txt,(100,100-dh),font,font_scale,(255,0,255),thickness)</span><br><span class="line">    cv.imshow(&quot;canvas&quot;,canvas)</span><br><span class="line">    # wait for a key press, then close the window</span><br><span class="line"></span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure><h4 id="rectangle函数"><a href="#rectangle函数" class="headerlink" title="rectangle函数"></a>The rectangle function</h4><p>cv2.rectangle(img, pt1, pt2, color, thickness, lineType, shift)</p><p>The parameters are, in order: the image, the top-left corner of the rectangle, the bottom-right corner of the rectangle, the line color and the line thickness, followed by the optional lineType and shift.</p><p>The function draws a rectangle on img. The coordinate origin is the top-left corner of the image; x grows to the right and y grows downward. pt1 and pt2 are (x, y) points, and color is given as (B, G, R).</p><h1 id="15随机数与随机颜色"><a href="#15随机数与随机颜色" class="headerlink" title="15随机数与随机颜色"></a>15 <strong>Random Numbers and Random Colors</strong></h1><h4 id="Numpy随机数"><a href="#Numpy随机数" class="headerlink" title="Numpy随机数"></a>NumPy random numbers</h4><p>•random.randint(low, high=None, size=None, dtype=int)</p><p>•low is the lower bound (inclusive), high is the upper bound (exclusive), size is the output shape, dtype is the data type</p><p>•np.random.randint(256)</p><p>•np.random.randint(0, 256)</p><p>•Both calls return a random integer in the range 0 to 255</p><p>•np.random.randint(0, 256, size=3) # size is the number of values to generate, returned as an array</p><h4 id="随机噪声图"><a 
href="#随机噪声图" class="headerlink" title="随机噪声图"></a>Random noise image</h4><p>•cv.randn(dst, mean, stddev)</p><p>•dst is the image that is filled with noise</p><p>•mean is the noise mean</p><p>•stddev is the noise standard deviation</p><p>•cv.randn(canvas, (40, 200, 140), (10, 50, 10)) # arguments: image, mean, standard deviation</p><h4 id="代码演示"><a href="#代码演示" class="headerlink" title="代码演示"></a>Code demo</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">def rando():</span><br><span class="line">    canvas = np.zeros((512, 512, 3), dtype=np.uint8)</span><br><span class="line">    # random draw</span><br><span class="line">    while True:</span><br><span class="line">        b,g,r = np.random.randint(0, 256, size=3)</span><br><span class="line">        x1 = np.random.randint(0, 512)</span><br><span class="line">        x2 = np.random.randint(0, 512)</span><br><span class="line">        y1 = np.random.randint(0, 512)</span><br><span class="line">        y2 = np.random.randint(0, 512)</span><br><span class="line">        cv.rectangle(canvas,(x1,y1), (x2, y2), (int (b), int(g), int (r)), -1, 8)</span><br><span class="line">        cv.imshow( &quot;canvas&quot;,canvas)</span><br><span class="line">        c = cv.waitKey(50)</span><br><span class="line">        if c == 27:</span><br><span class="line">            break</span><br><span class="line">        cv.rectangle(canvas, (0,0), (512, 512), (0, 0, 0),-1, 8)</span><br></pre></td></tr></table></figure><h1 id="16多边形填充与绘制"><a href="#16多边形填充与绘制" class="headerlink" title="16多边形填充与绘制"></a>16 Filling and Drawing Polygons</h1><h4 
id="绘制函数"><a href="#绘制函数" class="headerlink" title="绘制函数"></a>Drawing functions</h4><p>•cv.fillPoly(img, pts, color[, lineType[, shift[, offset]]]) -&gt; img</p><p>•fills polygons</p><p>•cv.polylines(img, pts, isClosed, color[, thickness[, lineType[, shift]]]) -&gt; img</p><p>•draws polygon outlines</p><p>•pts is one or more point sets; polylines can draw several polygons in one call</p><p>•color is the color</p><p>•thickness is the line width. <strong>Note:</strong> it must be greater than 0</p><p>•lineType is the rendering mode</p><h4 id="点集支持"><a href="#点集支持" class="headerlink" title="点集支持"></a>Point sets</h4><p>•pts is one or more point sets</p><p>•pts = []</p><p>•pts.append((100, 100))</p><p>•pts.append((200, 50))</p><p>•pts.append((280, 100))</p><p>•pts.append((290, 300))</p><p>•pts.append((50, 300))</p><p>•pts = np.asarray(pts, dtype=np.int32)</p><p>•print(pts.shape)</p><p><strong>•Requirement: the points must be of type CV_32S, which corresponds to np.int32</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">def paintmore():</span><br><span class="line">    canvas = np.zeros((512, 512, 3), dtype=np.uint8)</span><br><span class="line">    pts =[]</span><br><span class="line">    pts.append((100, 100))</span><br><span class="line">    pts.append((200, 50))</span><br><span class="line">    pts.append((280, 
100))</span><br><span class="line">    pts.append((290, 300))</span><br><span class="line">    pts.append((50, 300))</span><br><span class="line">    pts = np.asarray(pts, dtype=np.int32) # must be int32</span><br><span class="line">    print(pts.shape)</span><br><span class="line"></span><br><span class="line">    pts2 = []</span><br><span class="line">    pts2.append((300, 300))</span><br><span class="line">    pts2.append((400, 250))</span><br><span class="line">    pts2.append((500, 300))</span><br><span class="line">    pts2.append((500, 500))</span><br><span class="line">    pts2.append((250, 500))</span><br><span class="line">    pts2 = np.asarray(pts2, dtype=np.int32)</span><br><span class="line">    print(pts2.shape)</span><br><span class="line"></span><br><span class="line">    cv.polylines(canvas, [pts, pts2], True, (0, 0, 255), 2, 8)</span><br><span class="line">    cv.fillPoly(canvas, [pts, pts2], (255, 0, 0), 8, 0)</span><br><span class="line">    cv.imshow(&quot;poly-demo&quot;, canvas)</span><br><span class="line">    cv.waitKey(0)</span><br><span class="line">    cv.destroyAllWindows()</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;13图像统计信息&quot;&gt;&lt;a href=&quot;#13图像统计信息&quot; class=&quot;headerlink&quot; title=&quot;13图像统计信息&quot;&gt;&lt;/a&gt;13图像统计信息&lt;/h1&gt;&lt;h4 id=&quot;像素值统计-均值&quot;&gt;&lt;a href=&quot;#像素值统计-均值&quot; class=&quot;head</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——3</title>
    <link href="https://yang-makabaka.github.io/posts/b28fd6df.html"/>
    <id>https://yang-makabaka.github.io/posts/b28fd6df.html</id>
    <published>2022-12-16T14:59:34.000Z</published>
    <updated>2022-12-24T13:40:12.100Z</updated>
    
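    <content type="html"><![CDATA[<h2 id="09滚动条操作"><a href="#09滚动条操作" class="headerlink" title="09滚动条操作"></a>09 Trackbar Operations</h2><h4 id="Callback回调基本流程"><a href="#Callback回调基本流程" class="headerlink" title="Callback回调基本流程"></a>Basic callback flow</h4><p><img src="https://s3.bmp.ovh/imgs/2022/12/16/f5830e4e8f2ee9c9.png" alt=""></p><p>A quoted analogy: you go to a shop, but the item you want is out of stock, so you leave your phone number with the clerk. A few days later the item arrives, the clerk calls you, and you go to the shop to pick it up. In this analogy, your phone number is the callback function, leaving the number with the clerk is registering the callback, the item arriving is the event that triggers the callback, the clerk calling you is invoking the callback, and you going to pick up the item is handling the event. (Source: <a href="https://www.zhihu.com/question/19801131">https://www.zhihu.com/question/19801131</a>)</p><p>Register first, then use.</p><h4 id="事件响应函数"><a href="#事件响应函数" class="headerlink" title="事件响应函数"></a>Event handler function</h4><p>•<code>typedef void(* cv::TrackbarCallback)(int pos /* slider position */, void *userdata /* user data, optional */)</code></p><p>•Declare and implement the event handler function</p><p><code>•def trackbar_callback(pos):</code></p><p>    <code>print(pos)</code></p><h4 id="创建窗口函数"><a href="#创建窗口函数" class="headerlink" title="创建窗口函数"></a>Window creation function</h4><p>•<code>cv.namedWindow(winname [, flags]) -&gt; None</code></p><p>•winname is the window title</p><p>•Supported values of flags:</p><p>WINDOW_NORMAL: the window can be resized; useful when the image is large</p><p>WINDOW_AUTOSIZE: the window is sized to fit the image and cannot be resized</p><p>WINDOW_KEEPRATIO: the image aspect ratio is kept when the window is resized</p><h4 id="调整图像亮度"><a href="#调整图像亮度" class="headerlink" title="调整图像亮度"></a>Adjusting image brightness</h4><p>•The RGB values determine brightness</p><p>•RGB(0, 0, 0) is black and RGB(255, 255, 255) is white; brightness is adjusted by changing the pixel values</p><p>•add supports image + image and image + constant</p><p>•subtract supports image - image and image - constant</p><p>•Dynamic adjustment: the trackbar changes the constant, so the brightness is updated and the display refreshed interactively</p><p>•Create the image window</p><p>•Create the trackbar</p><p>•Show the image in the window</p><p>•Drag the trackbar to change the image brightness</p><p><img src="https://z4a.net/images/2022/12/16/222.png" alt=""></p><h2 id="10键盘响应操作"><a href="#10键盘响应操作" class="headerlink" title="10键盘响应操作"></a>10 Keyboard Events</h2><h4 id="键盘响应事件"><a href="#键盘响应事件" class="headerlink" title="键盘响应事件"></a>Keyboard event handling</h4><p>•cv.waitKey([delay]) -&gt; retval</p><p>If delay is omitted or delay = 0, the call blocks until a key is pressed</p><p>If delay is greater than 0, the call blocks for at most that many milliseconds</p><p>retval is the code of the pressed key. Note: key codes can differ between operating systems</p><p>Typically retval = 27 is the ESC key</p><h4 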
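id="响应不同的键盘操作"><a href="#响应不同的键盘操作" class="headerlink" title="响应不同的键盘操作"></a>Responding to different keys</h4><p>•Check the returned key code and perform a different action for each key</p><p>•An if-elif-else chain is recommended; Python 3.10 and later also support match-case, Python's counterpart of switch-case</p><p>if &lt;expr&gt;:</p><p>    &lt;statement(s)&gt;</p><p>elif &lt;expr&gt;:</p><p>    &lt;statement(s)&gt;</p><p>elif &lt;expr&gt;:</p><p>    &lt;statement(s)&gt;</p><p>…</p><p>else:</p><p>    &lt;statement(s)&gt;</p><p><img src="https://z4a.net/images/2022/12/16/333.png" alt=""></p><p>•Press ESC to exit</p><p>•Press 1 to show the HSV image</p><p>•Press 2 to show the YCrCb image</p><p>•Press 3 to show the RGB image</p><p>•Press 0 to restore the original BGR image</p><h2 id="11自带颜色表操作"><a href="#11自带颜色表操作" class="headerlink" title="11自带颜色表操作"></a>11 Built-in Color Maps</h2><p>Lookup table (LUT)</p><p>Advantage: values are precomputed, trading memory for time, which avoids repeated computation and saves processing time</p><h4 id="Gamma校正"><a href="#Gamma校正" class="headerlink" title="Gamma校正"></a>Gamma correction</h4><p>•In the formula, p(x, y) is the pixel value of the input image</p><p><img src="https://z4a.net/images/2022/12/16/444.png" alt=""></p><p>•Pixel values range from 0 to 255 and each value maps to exactly one output value, so the mapping can be precomputed as a lookup table (LUT)</p><p>•The input pixel value is used as an index into the LUT to read the gamma-corrected value directly</p><p>•For a 256x256 image, the number of gamma computations compares as follows:</p><p>•without a lookup table: 65536</p><p>•with a lookup table: 256</p><h4 id="OpenCV中LUT支持"><a href="#OpenCV中LUT支持" class="headerlink" title="OpenCV中LUT支持"></a>LUT support in OpenCV</h4><p>•cv.applyColorMap(src, colormap[, dst]) -&gt; dst</p><p>•the first argument is the input image</p><p>•the second argument is the color map</p><p>•dst is the returned image</p><p><img src="https://z4a.net/images/2022/12/16/555.png" alt=""></p><p><img src="https://z4a.net/images/2022/12/16/666.png" alt=""></p><p>Use cv.applyColorMap for the built-in color maps and cv.LUT for a custom lookup table</p><p>A custom colormap must have size 256x1</p><h2 id="12通道分离与合并"><a href="#12通道分离与合并" class="headerlink" title="12通道分离与合并"></a>12 Splitting and Merging Channels</h2><h4 id="通道分类与合并"><a href="#通道分类与合并" class="headerlink" title="通道分类与合并"></a>Channel splitting and merging</h4><p>An RGB/HSV color image can be split into separate single-channel images</p><p>Different thresholds can then be applied per channel to extract a mask</p><h4 id="分离函数"><a href="#分离函数" class="headerlink" title="分离函数"></a>Split function</h4><p>•Channel split function: cv.split(m[, mv]) -&gt; mv</p><p>•m is the input image and must be multi-channel</p><p>•mv is the output list of single-channel arrays</p><h4 id="合并与混合"><a href="#合并与混合" class="headerlink" title="合并与混合"></a>Merging and mixing</h4><p>•cv.merge(mv[, dst]) -&gt; dst</p><p>mv is the list of channels to merge</p><p>•cv.mixChannels(src, dst, fromTo) -&gt; dst</p><p>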
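src is the list of input multi-channel images</p><p>fromTo lists the channel index pairs (source index, destination index)</p><p>dst is the returned result</p><h4 id="通道阈值"><a href="#通道阈值" class="headerlink" title="通道阈值"></a>Channel thresholding</h4><p>•cv.inRange(src, lowerb, upperb[, dst]) -&gt; dst</p><p>Produces a binary image</p><p>•src is the input image</p><p>•lowerb is the lower bound</p><p>•upperb is the upper bound</p><p>•dst = (lowerb &lt;= src &lt;= upperb)</p><p>Pixels inside the range are set to 255 (white), pixels outside the range are set to 0 (black)</p><p><img src="https://z4a.net/images/2022/12/16/777.png" alt=""></p>]]></content>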
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;09滚动条操作&quot;&gt;&lt;a href=&quot;#09滚动条操作&quot; class=&quot;headerlink&quot; title=&quot;09滚动条操作&quot;&gt;&lt;/a&gt;09滚动条操作&lt;/h2&gt;&lt;h4 id=&quot;Callback回调基本流程&quot;&gt;&lt;a href=&quot;#Callback回调基本流程&quot; cla</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——2</title>
    <link href="https://yang-makabaka.github.io/posts/c588e649.html"/>
    <id>https://yang-makabaka.github.io/posts/c588e649.html</id>
    <published>2022-12-14T14:19:03.000Z</published>
    <updated>2022-12-24T13:40:21.800Z</updated>
    
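    <content type="html"><![CDATA[<h2 id="05-图像色彩空间转换"><a href="#05-图像色彩空间转换" class="headerlink" title="05 图像色彩空间转换"></a>05 Color Space Conversion</h2><p><strong>Common color spaces: HSV, RGB, YCrCb</strong></p><p>RGB color space: device independent</p><p>HSV color space: convenient for computation and good at separating colors</p><p>YCrCb: the Y component carries the luminance information; Cr and Cb can be compressed</p><p>RGB is the standard color system supported by computer displays</p><p>RGB values range from 0 to 255</p><p>HSV ranges: H (hue) 0° to 360° (stored as 0 to 179 in OpenCV 8-bit images), S (saturation) 0 to 255, V (value) 0 to 255</p><p><strong>When converting from one color space to another</strong>, consider:</p><p>how information is carried over or lost, and whether the conversion is reversible</p><p><strong>Function and parameters</strong></p><p>cv.cvtColor(<strong>src</strong>,<strong>code</strong>[,dst[,dstCn]])-&gt;dst</p><p><strong>·</strong> src is the input image, of type CV_8U or CV_32F</p><p><strong>·</strong> code is the conversion code, for example:</p><p>cv::COLOR_BGR2RGB = 4</p><p>cv::COLOR_BGR2GRAY = 6</p><p>cv::COLOR_GRAY2BGR = 8</p><p>cv::COLOR_BGR2HSV = 40</p><p>Example: img2 = cv.cvtColor(img1, cv.COLOR_BGR2GRAY)</p><p>Note: converting a color image to grayscale reduces three channels to one and permanently discards part of the information; converting back to BGR yields three channels again, but the image stays gray.</p><h2 id="06-图像对象的创建与赋值"><a href="#06-图像对象的创建与赋值" class="headerlink" title="06 图像对象的创建与赋值"></a>06 Creating and Assigning Image Objects</h2><p><strong>① Data types supported by OpenCV-Python</strong>: np.uint8 (default), np.float32 (convenient for arithmetic), np.int32, np.int64</p><p><strong>② Commonly used NumPy functions:</strong></p><p>numpy.array, numpy.zeros, numpy.zeros_like (quickly creates an all-black image with the same size as a loaded image), numpy.asarray (converts an ordinary array into a NumPy array), numpy.copy (copies an image), numpy.reshape (changes the array shape)</p><p>Function notes:</p><p>(1) numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)</p><p>object: the array-like input</p><p>dtype: the data type </p><p>(2) numpy.zeros(shape, dtype=float, order='C', *, like=None) </p><p>shape: the array dimensions</p><p>dtype: the data type</p><p>(3) numpy.asarray(a, dtype=None, order=None, *, like=None) </p><p>a: the array-like input</p><p>dtype: the data type</p><p>(4) numpy.reshape(a, newshape, order='C')</p><p>a: the array to reshape</p><p>newshape: the new array dimensions</p><p><strong>③ Concepts</strong></p><p>In opencv-python, all image data is a NumPy array</p><p>Creating an image means creating a NumPy array</p><p><strong>④ 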
Creating images</strong></p><p>1) Import: import numpy as np</p><p>2) Create: np.array([[1, 2],[3, 4]], dtype=np.uint8)</p><p>3) Most commonly used functions for creating images:</p><p>np.zeros -&gt; creates an image with a black background</p><p>np.zeros_like -&gt; creates a black image with the same size as the input image</p><p>np.ones -&gt; creates an image in which every pixel value is 1</p><p><strong>⑤ Assigning to images</strong></p><p>Assigning to an image means assigning to a NumPy array</p><p>m = np.zeros((3, 3, 3), dtype=np.uint8)</p><p>m[:] = 255 sets every element of m to 255 (white)</p><p>m[:] = (255,0,0) sets every pixel of m to (255,0,0), which is blue in BGR order</p><p>h,w,c = img.shape gives the height, width and number of channels</p><h2 id="07-图像像素的读写操作"><a href="#07-图像像素的读写操作" class="headerlink" title="07 图像像素的读写操作"></a>07 Reading and Writing Pixels</h2><p> <strong>Understanding pixels:</strong></p><p>Physical size and pixel count: dpi x inches = number of pixels</p><p>dpi: dots per inch, e.g. 96 dpi; used for printing</p><p>ppi: pixels per inch; used for image resolution</p><p><strong>Pixels in OpenCV</strong></p><p>A matrix stores the information of every pixel</p><p>Iterating over pixels is simply NumPy array access</p><p><strong>Given a variable image</strong></p><p>Get the image dimensions: image.shape</p><p>Read a pixel: image[row, col]</p><p>Write a pixel: image[row, col] = (b,g,r)</p><p>Reading and writing pixels, color image:</p><p>b, g, r = image[row, col]</p><p>image[row, col] = (255-b, 255-g, 255-r)</p><p>Reading and writing pixels, grayscale image:</p><p>pv = image[row, col]</p><p>image[row, col] = 255-pv</p><h2 id="08-图像算数操作"><a href="#08-图像算数操作" class="headerlink" title="08 图像算数操作"></a>08 Image Arithmetic</h2><p><strong>Once loaded, an image is an array, so basic arithmetic operations apply to it</strong></p><p><strong>Add</strong> cv.add(src1, src2[, dst[, mask[, dtype]]]) -&gt;dst</p><p><strong>Subtract</strong> cv.subtract(src1,src2[,dst[,mask[,dtype]]])-&gt;dst</p><p>The <strong>mask</strong> parameter restricts the operation: inside the mask the result is computed normally, outside it the output is 0</p><p><strong>Multiply</strong> cv.multiply(src1,src2[,dst[,scale[,dtype]]])-&gt;dst</p><p><strong>Divide</strong> cv.divide(src1, src2[, dst[, scale[, dtype]]])-&gt;dst</p><p><strong>Parameters</strong>: src1 and src2 are the input images</p><p><strong>How addition avoids overflow</strong>: saturate(src1 + 
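src2) maps the result into the range 0 to 255. saturate_cast clamps the result after the operation: a negative result becomes 0 and a result above 255 becomes 255.</p><p><strong>Requirement for image arithmetic</strong>: the images must have the same size and the same number of channels</p><p><strong>Weighted addition</strong>: added_wt_img = cv2.addWeighted(img1, 0.6, img2, 0.4, 0)</p><p><strong>mask</strong> is a template given as a matrix: a 0 means the value at that position is ignored, a non-zero value means the value at that position is kept.</p>]]></content>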
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;05-图像色彩空间转换&quot;&gt;&lt;a href=&quot;#05-图像色彩空间转换&quot; class=&quot;headerlink&quot; title=&quot;05 图像色彩空间转换&quot;&gt;&lt;/a&gt;05 图像色彩空间转换&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;常见的色彩空间：HSV、RGB、YCrCb&lt;/str</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>OpenCV——1</title>
    <link href="https://yang-makabaka.github.io/posts/5c81b7f3.html"/>
    <id>https://yang-makabaka.github.io/posts/5c81b7f3.html</id>
    <published>2022-12-12T13:19:26.000Z</published>
    <updated>2022-12-24T13:40:32.770Z</updated>
    
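    <content type="html"><![CDATA[<h1 id="一、认识计算机视觉"><a href="#一、认识计算机视觉" class="headerlink" title="一、认识计算机视觉"></a>I. Getting to Know Computer Vision</h1><h2 id="1-发展历史"><a href="#1-发展历史" class="headerlink" title="1.发展历史"></a>1. History</h2><p>•The earliest roots go back to Mozi's pinhole imaging experiment</p><p>•In modern times, in 1966 a student of Marvin Minsky at MIT connected a camera to a computer, which marks the start of computer vision as a field of study</p><p>•1982: David Marr's book Vision was published, establishing CV as a discipline in its own right</p><p>•1999: David Lowe published the SIFT feature paper; the algorithm was later included in OpenCV</p><p>•2001: Viola and Jones (V&amp;J) published a real-time face detection algorithm based on Haar features</p><p>•2005: the HOG feature descriptor was proposed for pedestrian detection</p><p>•2006: the Pascal VOC dataset was released</p><p>•2012: AlexNet won the ImageNet image classification challenge, showing the potential of deep learning in CV</p><p>•The future will be inseparable from CV</p><h2 id="2-主要任务"><a href="#2-主要任务" class="headerlink" title="2.主要任务"></a>2. Main tasks</h2><p>Early research focused mainly on <u>reconstruction</u></p><p>After 2012, driven by deep learning, <u>reconstruction</u> and <u>perception</u> both advanced rapidly</p><p>Goal: pass the Turing test</p><h2 id="3-应用场景"><a href="#3-应用场景" class="headerlink" title="3.应用场景"></a>3. Application scenarios</h2><p>•Autonomous driving / driver assistance</p><p>•Computer vision and AI combined with mechanical systems / industrial quality inspection</p><p>…</p><p>•Industry applications covering all scenarios</p><h1 id="二、计算机视觉框架"><a href="#二、计算机视觉框架" class="headerlink" title="二、计算机视觉框架"></a>II. Computer Vision Frameworks</h1><h2 id="1-计算机视觉框架"><a href="#1-计算机视觉框架" class="headerlink" title="1.计算机视觉框架"></a>1. Computer vision frameworks</h2><p>•Matlab: traces back to the 1970s; supports image processing</p><p>•Matrox MIL: first version released in 1993</p><p>•Halcon: dates back to 1996; the most widely used mainstream framework in the CV field</p><p>•OpenCV: started in 1999, version 1.0 released in 2006; open source</p><p>•VisionPro: 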
released in 2009</p><p>Traditional computer vision frameworks</p><p>•SimpleCV</p><p>•BoofCV</p><p>•Dlib</p><p>•JavaCV</p><p>Deep learning computer vision frameworks (training)</p><p>•Caffe</p><p>•TensorFlow</p><p>•PyTorch</p><p>•PaddlePaddle</p><p>•Keras</p><p>Deep learning computer vision frameworks (deployment)</p><p>•OpenVINO</p><p>•TensorRT</p><p>•onnxruntime</p><p>•Deepface</p><p>•YOLO/DarkNet</p><p>•mmdetection</p><p>•Paddle-detection/seg/ocr</p><h2 id="2-当前主流框架"><a href="#2-当前主流框架" class="headerlink" title="2.当前主流框架"></a>2. Current mainstream frameworks</h2><p>•Machine vision: Halcon / VisionPro / MIL / OpenCV</p><p>•Deep learning: TensorFlow / PyTorch / PaddlePaddle + OpenVINO / TensorRT / onnxruntime</p><p>•Mainstream languages: Python / C++</p><h2 id="3-计算机视觉框架的未来趋势"><a href="#3-计算机视觉框架的未来趋势" class="headerlink" title="3.计算机视觉框架的未来趋势"></a>3. Future trends of computer vision frameworks</h2><p>•Low-code platforms are clearly gaining popularity</p><p>•Traditional vision and deep learning are clearly converging</p><p>•Algorithm design is becoming workflow-based and visual</p><p>•Algorithm modules are becoming easier to use and more general</p><p>•Support for heterogeneous compute resources is growing</p><p>•Deep learning model training is becoming simpler</p><h1 id="三、OpenCV"><a href="#三、OpenCV" class="headerlink" title="三、OpenCV"></a>III. OpenCV</h1><p>•github: <a href="https://github.com/opencv">https://github.com/opencv</a></p><p>•Tutorial: <a href="https://docs.opencv.org/4.5.5/index.html">https://docs.opencv.org/4.5.5/index.html</a></p><h2 id="1-发展历史-1"><a href="#1-发展历史-1" class="headerlink" title="1.发展历史"></a>1. History</h2><p>•Development of OpenCV started in 1999</p><p>•2006: OpenCV 1.0 released (C)</p><p>•2009: OpenCV 2.0 released (C++)</p><p>•2012: moved to community-hosted development (open source)</p><p>•2015: OpenCV 3.0 released (refined interfaces)</p><p>•2018: OpenCV 4.0 released</p><p>•As of April 2022: version 4.5.5</p><h2 id="2-OpenCV模块架构"><a href="#2-OpenCV模块架构" class="headerlink" title="2.OpenCV模块架构"></a>2. OpenCV module architecture</h2><p><img src="/Users/yang/Library/Application Support/typora-user-images/image-20221212200329994.png" alt="image-20221212200329994"></p><h2 id="3-OpenCV安装与支持"><a href="#3-OpenCV安装与支持" class="headerlink" title="3.OpenCV安装与支持"></a>3. Installing OpenCV</h2><p>•Install the Python SDK; version 3.6.5 is recommended</p><p>•Install OpenCV-Python: <code>pip install opencv-python==4.5.4.60</code></p><p>(a mirror can be used by appending <code>-i https://pypi.tuna.tsinghua.edu.cn/simple</code>)</p><p>•Verify the installation: <code>pip list</code></p><p><code>python</code></p><p><code>import cv2 as 
cv</code></p><p><code>cv.__version__</code></p><h2 id="4-Intel-Devcloud-codelab使用"><a href="#4-Intel-Devcloud-codelab使用" class="headerlink" title="4.Intel Devcloud codelab使用"></a>4. Using the Intel DevCloud codelab</h2><p>URL: devcloud.intel.com/edge (register and log in)</p><p>Learn &gt; Tutorials &gt; OpenCV Tutorial</p><h1 id="四、图像读取与显示"><a href="#四、图像读取与显示" class="headerlink" title="四、图像读取与显示"></a>IV. Reading and Displaying Images</h1><p>A computer represents grayscale and color images as numeric values</p><h2 id="图像读取与显示"><a href="#图像读取与显示" class="headerlink" title="图像读取与显示"></a><strong>Reading and displaying images</strong></h2><p>•<code>import cv2 as cv</code>: imports OpenCV</p><p>•<code>import numpy as np</code>: imports NumPy</p><p>•imread reads an image</p><p>•imshow displays an image</p><p>•Loaded images use BGR channel order</p><p>•<code>cv.imread(filename[,flags])</code></p><p>- filename is the file path</p><p>- parameters inside [] are optional and may be omitted</p><p>•<code>cv.imshow( winname, mat)</code> # expects BGR</p><p>- winname is the window title</p><p>- mat is the image object</p><p>•<code>cv.waitKey(0)</code>  # waits indefinitely until any key is pressed</p><p>•<code>cv.waitKey(1000)</code>  # waits for up to 1000 milliseconds (1 second)</p><p>•<code>cv.destroyAllWindows()</code>  # closes all windows and frees the associated memory. In a simple program this call is not strictly needed, because the operating system releases all resources and windows of the application on exit</p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;一、认识计算机视觉&quot;&gt;&lt;a href=&quot;#一、认识计算机视觉&quot; class=&quot;headerlink&quot; title=&quot;一、认识计算机视觉&quot;&gt;&lt;/a&gt;一、认识计算机视觉&lt;/h1&gt;&lt;h2 id=&quot;1-发展历史&quot;&gt;&lt;a href=&quot;#1-发展历史&quot; class=&quot;head</summary>
      
    
    
    
    
    <category term="CV" scheme="https://yang-makabaka.github.io/tags/CV/"/>
    
  </entry>
  
  <entry>
    <title>Ubuntu 22.04 Initial Setup and Tuning</title>
    <link href="https://yang-makabaka.github.io/posts/dac10dd2.html"/>
    <id>https://yang-makabaka.github.io/posts/dac10dd2.html</id>
    <published>2022-12-06T15:19:12.000Z</published>
    <updated>2024-02-17T12:48:41.661Z</updated>
    
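    <content type="html"><![CDATA[<h1 id="Ubuntu前期优化"><a href="#Ubuntu前期优化" class="headerlink" title="Ubuntu前期优化"></a>Ubuntu Initial Setup and Tuning</h1><h2 id="一、系统设置"><a href="#一、系统设置" class="headerlink" title="一、系统设置"></a>I. System Settings</h2><h4 id="1-换源"><a href="#1-换源" class="headerlink" title="1.换源"></a>1. Change the software source</h4><p>“Software &amp; Updates” &gt; “Ubuntu Software” &gt; “Download from”: change to “Server for China”</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/ec0c52678b306689.png" alt=""></p><h4 id="2-显卡驱动"><a href="#2-显卡驱动" class="headerlink" title="2.显卡驱动"></a>2. Graphics driver</h4><p>To be updated...</p><h4 id="3-界面优化"><a href="#3-界面优化" class="headerlink" title="3.界面优化"></a>3. Interface tweaks</h4><p>Adjust to your own taste.</p><p>“Settings” &gt; “Appearance” &gt; “Desktop Icons”: turn off “Show Personal folder”</p><p>“Settings” &gt; “Appearance” &gt; “Dock”: turn on auto-hide the Dock, turn off panel mode, and set the position on screen to bottom</p><p>“Settings” &gt; “Mouse &amp; Touchpad” &gt; “Touchpad”: turn on natural scrolling, adjust the touchpad speed, and turn on two-finger scrolling</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/e0e1f98b2962a84d.png" alt=""></p><h4 id="4-关闭更新"><a href="#4-关闭更新" class="headerlink" title="4.关闭更新"></a>4. Turn off updates</h4><p>“Software &amp; Updates” &gt; “Updates”: adjust as you prefer</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/b92ce79306df42e0.png" alt=""></p><h4 id="5-关闭Dock栏显示其他分区磁盘（若出现此情况）"><a href="#5-关闭Dock栏显示其他分区磁盘（若出现此情况）" class="headerlink" title="5.关闭Dock栏显示其他分区磁盘（若出现此情况）"></a>5. Hide other disk partitions from the Dock (if they show up)</h4><p>“Disks” &gt; select the disk and partition &gt; “Additional partition options” &gt; “Edit Mount Options” &gt; turn off “User Session Defaults” &gt; uncheck “Mount at system startup” and “Show in user interface” &gt; confirm with OK</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/b0574b884c8ea719.png" alt=""></p><h4 id="6-设置区域与语言"><a href="#6-设置区域与语言" class="headerlink" title="6.设置区域与语言"></a>6. Region and language</h4><p>“Settings” &gt; “Region &amp; Language” &gt; “Manage Installed Languages”: a prompt reports that language support is not completely installed; click Install and reboot when it finishes.</p><h4 id="7-终端设置"><a href="#7-终端设置" class="headerlink" title="7.终端设置"></a>7. Terminal settings</h4><p>Adjust to your own taste.</p><p>Shortcut to open a terminal: Ctrl + Alt + T</p><p>Open a terminal &gt; “Preferences” &gt; the profile preferences: under “Built-in schemes” choose “Linux console” and enable “Show bold text in bright colors”; the change takes effect immediately.</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/f329d103efa8610f.png" alt=""></p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/3cfe28f1eb0ab101.png" alt=""></p><h4 id="8-修改用户目录为中文"><a href="#8-修改用户目录为中文" class="headerlink" title="8.修改用户目录为中文"></a>8. Rename the user directories to English</h4><p>Open a terminal and enter the following commands:</p><figure 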
class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">export LANG=en_US</span><br><span class="line">xdg-user-dirs-gtk-update</span><br></pre></td></tr></table></figure><p>弹出对话框 ，<strong><em>不要勾选</em></strong>“下次别问我”之类的选项，选择更新名称。</p><p>终端输入：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">export LANG=zh_CN</span><br></pre></td></tr></table></figure><p>关闭终端，重启系统。</p><p>进入系统，系统会提示是否把目录改回中文，勾选“不要再次询问我”，选择<strong><em>保留旧的名称</em></strong></p><h4 id="9-命令优化（选做）"><a href="#9-命令优化（选做）" class="headerlink" title="9.命令优化（选做）"></a>9.命令优化（选做）</h4><h5 id="1、添加open命令"><a href="#1、添加open命令" class="headerlink" title="1、添加open命令"></a>1、添加open命令</h5><h5 id="（1）打开当前目录"><a href="#（1）打开当前目录" class="headerlink" title="（1）打开当前目录"></a>（1）打开当前目录</h5><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#打开var目录</span></span><br><span class="line">nautilus /var</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 1. 
编辑（需要root权限）</span></span><br><span class="line">sudo vim /etc/profile</span><br><span class="line"></span><br><span class="line"><span class="comment"># 2.取别名（别名后的参数会自动追加到命令末尾，无需写$1）</span></span><br><span class="line">alias <span class="built_in">open</span>=<span class="string">&quot;nautilus&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 3.使配置生效</span></span><br><span class="line">source /etc/profile</span><br><span class="line"></span><br><span class="line"><span class="comment"># 4.打开文件夹</span></span><br><span class="line"><span class="built_in">open</span> .</span><br></pre></td></tr></table></figure><h5 id="（2）添加命令"><a href="#（2）添加命令" class="headerlink" title="（2）添加命令"></a>（2）添加命令</h5><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">vi ~/.bashrc  # 或者 gedit ~/.bashrc 个人习惯了vi 命令</span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">添加如下内容</span></span><br><span class="line">alias open=&quot;nautilus .&quot;</span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">使配置生效</span></span><br><span class="line">source ~/.bashrc</span><br></pre></td></tr></table></figure><h5 id="（3）提示：vim操作"><a href="#（3）提示：vim操作" class="headerlink" title="（3）提示：vim操作"></a>（3）提示：vim操作</h5><p>“i”：进入插入模式</p><p>“Esc”键：退出插入模式</p><p>输入 “:wq”：保存并退出</p><h4 id="10-科学上网"><a href="#10-科学上网" class="headerlink" title="10.科学上网"></a>10.科学上网</h4><p>官网：<a href="https://www.clash.la/releases/">https://www.clash.la/releases/</a></p><p>或：<a href="https://archive.org/download/clash_for_windows_pkg">https://archive.org/download/clash_for_windows_pkg</a><br>（按自己系统版本选择）</p><p>将文件解压至/snap，在终端运行或双击打开解压后的文件夹里的“cfw”</p><p>“General”——“Service Mode”——“Manage”——“Install”</p><p>“General”——打开“TUN 
Mode”</p><p>“General”——打开“start with Linux”（打开后每次开机会自动启动clash）</p><p>“Profiles”——在这添加自己的配置</p><h4 id="11-GNOME-Tweaks-和扩展（推荐，用于美化系统）"><a href="#11-GNOME-Tweaks-和扩展（推荐，用于美化系统）" class="headerlink" title="11.GNOME Tweaks 和扩展（推荐，用于美化系统）"></a>11.GNOME Tweaks 和扩展（推荐，用于美化系统）</h4><p>打开系统的商店，左上角搜索“GNOME”，安装“GNOME Tweaks”,之后在程序坞中找到“优化”打开，即可进行更多系统设置。<br>同时可安装“扩展管理器”，寻找更多扩展插件。</p><h2 id="二、软件安装"><a href="#二、软件安装" class="headerlink" title="二、软件安装"></a>二、软件安装</h2><h4 id="Typora"><a href="#Typora" class="headerlink" title="Typora"></a>Typora</h4><p>（好用的markdown编辑器）<br>链接: <a href="https://pan.baidu.com/s/1atxTuNOmyeCL4cFiMd1BLg?pwd=jxsn">https://pan.baidu.com/s/1atxTuNOmyeCL4cFiMd1BLg?pwd=jxsn</a> 提取码: jxsn</p><h5 id="1-安装"><a href="#1-安装" class="headerlink" title="(1)安装"></a>(1)安装</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#在所下载包的文件夹打开终端</span></span><br><span class="line">tar xzvf Typora-linux-x64.tar.gz </span><br><span class="line"><span class="built_in">cd</span> bin</span><br><span class="line">sudo <span class="built_in">cp</span> -ar Typora-linux-x64 /opt</span><br><span class="line"><span class="built_in">cd</span> /opt/Typora-linux-x64/</span><br><span class="line"><span class="comment">#启动</span></span><br><span class="line">./Typora</span><br></pre></td></tr></table></figure><h5 id="（2）配置"><a href="#（2）配置" class="headerlink" title="（2）配置"></a>（2）配置</h5><p>设置环境变量</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo vim ~/.bashrc</span><br></pre></td></tr></table></figure><p>打开.bashrc配置文件，添加：</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">#Typora环境变量</span><br><span class="line"><span class="keyword">export</span> PATH=$PATH:/opt/Typora-linux-x64</span><br></pre></td></tr></table></figure><p>source一下，让配置生效</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">source</span> ~/.bashrc</span><br></pre></td></tr></table></figure><p>之后可在终端输入“Typora”直接打开。</p><h5 id="（3）添加桌面面标"><a href="#（3）添加桌面面标" class="headerlink" title="（3）添加桌面图标"></a>（3）添加桌面图标</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> /usr/share/applications</span><br><span class="line">sudo vim typora.desktop</span><br></pre></td></tr></table></figure><p>添加以下内容，然后重启系统：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">[Desktop Entry]</span><br><span class="line">Name=Typora</span><br><span class="line">Comment=Typora</span><br><span class="line">Exec=/opt/Typora-linux-x64/Typora</span><br><span class="line">Icon=/opt/Typora-linux-x64/resources/app/asserts/icon/icon_256x256.png</span><br><span class="line">Terminal=false</span><br><span class="line">Type=Application</span><br><span class="line">Categories=Development;</span><br></pre></td></tr></table></figure><p>打开终端输入下面内容：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">gedit ~/.config/mimeapps.list</span><br></pre></td></tr></table></figure><p>在 <code>[Default Applications]</code> 一节下添加 <code>text/markdown=typora.desktop;</code>，使 Markdown 文件默认用 Typora 打开。</p><p><img src="https://s3.bmp.ovh/imgs/2022/12/23/1e796a3cac96307d.png" alt=""></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Ubuntu前期优化&quot;&gt;&lt;a href=&quot;#Ubuntu前期优化&quot; class=&quot;headerlink&quot; title=&quot;Ubuntu前期优化&quot;&gt;&lt;/a&gt;Ubuntu前期优化&lt;/h1&gt;&lt;h2 id=&quot;一、系统设置&quot;&gt;&lt;a href=&quot;#一、系统设置&quot; class=&quot;</summary>
      
    
    
    
    
    <category term="Linux" scheme="https://yang-makabaka.github.io/tags/Linux/"/>
    
  </entry>
  
  <entry>
    <title>Ubuntu22.04安装</title>
    <link href="https://yang-makabaka.github.io/posts/549a771b.html"/>
    <id>https://yang-makabaka.github.io/posts/549a771b.html</id>
    <published>2022-12-06T15:11:57.000Z</published>
    <updated>2024-04-20T05:42:05.973Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Ubuntu系统安装"><a href="#Ubuntu系统安装" class="headerlink" title="Ubuntu系统安装"></a>Ubuntu系统安装</h1><p><strong>更新于2024.4.18</strong><br><strong>准备容量至少为4G的U盘，最好8G及以上</strong></p><h2 id="1-准备工作"><a href="#1-准备工作" class="headerlink" title="1.准备工作"></a>1.准备工作</h2><p><strong>（1）下载Ubuntu镜像：<a href="https://cn.ubuntu.com/download/desktop">下载Ubuntu桌面系统 | Ubuntu</a></strong></p><p><strong>（2）提前从存储分出空白区域</strong></p><h4 id="下载工具：Diskgenius，下载后解压运行即可"><a href="#下载工具：Diskgenius，下载后解压运行即可" class="headerlink" title="下载工具：Diskgenius，下载后解压运行即可"></a>下载工具：<a href="https://www.diskgenius.cn/">Diskgenius</a>，下载后解压运行即可</h4><h4 id="参考（下面步骤看不懂时看这个链接）：-Diskgenius教程-https-www-diskgenius-cn-help-partspliting-php）"><a href="#参考（下面步骤看不懂时看这个链接）：-Diskgenius教程-https-www-diskgenius-cn-help-partspliting-php）" class="headerlink" title="参考（下面步骤看不懂时看这个链接）：Diskgenius教程"></a>参考（下面步骤看不懂时看这个链接）：<a href="https://www.diskgenius.cn/help/partspliting.php">Diskgenius教程</a></h4><h4 id="慎重选择要分出空间的分区，右键该分区，点击『拆分分区』，『分区后部的空间』即为分给ubuntu的空间，同时注意将其设置为『保持空闲』，完成后在界面左上角有类似『保存修改』选项，点击保存。"><a href="#慎重选择要分出空间的分区，右键该分区，点击『拆分分区』，『分区后部的空间』即为分给ubuntu的空间，同时注意将其设置为『保持空闲』，完成后在界面左上角有类似『保存修改』选项，点击保存。" class="headerlink" title="慎重选择要分出空间的分区，右键该分区，点击『拆分分区』，『分区后部的空间』即为分给ubuntu的空间，同时注意将其设置为『保持空闲』，完成后在界面左上角有类似『保存修改』选项，点击保存。"></a>慎重选择要分出空间的分区，右键该分区，点击『拆分分区』，『分区后部的空间』即为分给ubuntu的空间，同时注意将其设置为『保持空闲』，完成后在界面左上角有类似『保存修改』选项，点击保存。</h4><p><strong>（3）（仅当Windows启用了BitLocker时需要这一步，且必须执行）</strong><br>如何知道自己电脑是否开启BitLocker：打开Diskgenius，<br>查看分区是否如链接中第一张图所示带有加密字样：<a href="https://www.diskgenius.cn/help/bitlocker.php">Diskgenius</a></p><p>①可按照链接方法解锁<br>②也可以<a href="https://www.zhihu.com/question/370323833">参考这个知乎问题</a></p><p><strong>（4）下载U盘启动盘制作工具：</strong></p><p><strong>官网：<a href="https://rufus.ie/zh/">Rufus - 轻松创建USB启动盘</a></strong></p><p><strong>或网盘链接: <a 
href="https://pan.baidu.com/s/1BnXpb-07EtqTBXt28gpFHQ?pwd=1234">https://pan.baidu.com/s/1BnXpb-07EtqTBXt28gpFHQ?pwd=1234</a> 提取码: 1234</strong></p><h2 id="2-安装U盘制作"><a href="#2-安装U盘制作" class="headerlink" title="2.安装U盘制作"></a>2.安装U盘制作</h2><h4 id="1）将要制作的-U-盘插入电脑，打开Rufus"><a href="#1）将要制作的-U-盘插入电脑，打开Rufus" class="headerlink" title="1）将要制作的 U 盘插入电脑，打开Rufus"></a>1）将要制作的 U 盘插入电脑，打开Rufus</h4><h4 id="2）在分区方案和目标系统类型选项中选择用于UEFI计算机的GPT分区方案，文件系统-选择-NTFS"><a href="#2）在分区方案和目标系统类型选项中选择用于UEFI计算机的GPT分区方案，文件系统-选择-NTFS" class="headerlink" title="2）在分区方案和目标系统类型选项中选择用于UEFI计算机的GPT分区方案，文件系统 选择 NTFS"></a>2）在<strong>分区方案和目标系统类型</strong>选项中选择<strong>用于UEFI计算机的GPT分区方案</strong>，<strong>文件系统</strong> 选择 <strong>NTFS</strong></h4><h4 id="3）点击光盘图标选择好下载的光盘镜像文件"><a href="#3）点击光盘图标选择好下载的光盘镜像文件" class="headerlink" title="3）点击光盘图标选择好下载的光盘镜像文件"></a>3）点击光盘图标选择好下载的光盘镜像文件</h4><h4 id="4）点击“开始”进行制作，显示“准备就绪”后，关闭Rufus"><a href="#4）点击“开始”进行制作，显示“准备就绪”后，关闭Rufus" class="headerlink" title="4）点击“开始”进行制作，显示“准备就绪”后，关闭Rufus"></a>4）点击“开始”进行制作，显示“准备就绪”后，关闭Rufus</h4><h2 id="3-安装"><a href="#3-安装" class="headerlink" title="3.安装"></a>3.安装</h2><h4 id="1）重启，进BIOS（不同品牌电脑按键不同，可根据电脑型号去网上查找，参考https-zhuanlan-zhihu-com-p-34223088），关闭安全启动（将【Secure-Boot】设置为【Disabled】）。选择U盘（前面制作的启动盘）启动，保存重启。"><a href="#1）重启，进BIOS（不同品牌电脑按键不同，可根据电脑型号去网上查找，参考https-zhuanlan-zhihu-com-p-34223088），关闭安全启动（将【Secure-Boot】设置为【Disabled】）。选择U盘（前面制作的启动盘）启动，保存重启。" class="headerlink" title="1）重启，进BIOS（不同品牌电脑按键不同，可根据电脑型号去网上查找，参考https://zhuanlan.zhihu.com/p/34223088），关闭安全启动（将【Secure Boot】设置为【Disabled】）。选择U盘（前面制作的启动盘）启动，保存重启。"></a>1）重启，进BIOS（不同品牌电脑按键不同，可根据电脑型号去网上查找，参考<a href="https://zhuanlan.zhihu.com/p/34223088">https://zhuanlan.zhihu.com/p/34223088</a>），关闭安全启动（将【Secure Boot】设置为【Disabled】）。选择U盘（前面制作的启动盘）启动，保存重启。</h4><h4 id="2）在欢迎页面左右选择「中文（简体）」，再点击右侧的「安装-Ubuntu」按钮。"><a href="#2）在欢迎页面左右选择「中文（简体）」，再点击右侧的「安装-Ubuntu」按钮。" class="headerlink" title="2）在欢迎页面左侧选择「中文（简体）」，再点击右侧的「安装 Ubuntu」按钮。"></a>2）在欢迎页面左侧选择「<strong>中文（简体）</strong>」，再点击右侧的「<strong>安装 
Ubuntu</strong>」按钮。</h4><h4 id="3）选择chinese，最小安装，取消安装时下载更新，取消安装第三方软件（根据自身需要设置）"><a href="#3）选择chinese，最小安装，取消安装时下载更新，取消安装第三方软件（根据自身需要设置）" class="headerlink" title="3）选择chinese，最小安装，取消安装时下载更新，取消安装第三方软件（根据自身需要设置）"></a>3）选择chinese，最小安装，取消安装时下载更新，取消安装第三方软件（根据自身需要设置）</h4><h4 id="4）分区"><a href="#4）分区" class="headerlink" title="4）分区"></a>4）分区</h4><p>推荐双硬盘，在第二块硬盘安装，不会发生引导冲突，更稳定</p><p>若只想保留Ubuntu而删除Windows，选择“清除整个磁盘并安装”即可</p><p>   由于作者磁盘安装有多个系统，因此选择“其他选项”自己分配空间</p><p>   自用方案：按照下表顺序，依次从上面分出的空白区域中创建相应分区，注意千万不要选择错分区，不然数据无法恢复，若不会创建分区可搜索其他博客学习，写的匆忙故不做演示。</p><div class="table-container"><table><thead><tr><th>名称</th><th>EFI分区</th><th>swap交换分区</th><th>剩余空间(挂载到 ‘/‘ )</th></tr></thead><tbody><tr><td>分配空间大小</td><td>500m</td><td>16G（空间不足时可改为256m）</td><td>剩余空间</td></tr><tr><td>类型</td><td>逻辑分区</td><td>主分区</td><td>逻辑分区</td></tr><tr><td>位置</td><td>空间起始位置 固态硬盘</td><td>空间起始位置 固态硬盘</td><td>空间起始位置 固态硬盘</td></tr><tr><td>用于</td><td>EFI系统分区</td><td>交换空间</td><td>Ext4日志文件系统</td></tr></tbody></table></div><p>特别注意：下面『安装启动器位置』的选项选择上面创建的大小为500m的EFI分区对应的分区名称。</p><h4 id="5）之后经过一些设置，安装完成后重启电脑，重启时即可拔掉U盘"><a href="#5）之后经过一些设置，安装完成后重启电脑，重启时即可拔掉U盘" class="headerlink" title="5）之后经过一些设置，安装完成后重启电脑，重启时即可拔掉U盘"></a>5）之后经过一些设置，安装完成后重启电脑，重启时即可拔掉U盘</h4><p>以后电脑的引导会使用grub引导器，开机用上下键选择想进入的系统，默认为ubuntu，进windows需选择windows boot manager项，同时也可设置默认windows启动，可自行查资料解决。</p><h4 id="6-1）（多系统可能出现的问题）Ubuntu引导顶掉原先引导"><a href="#6-1）（多系统可能出现的问题）Ubuntu引导顶掉原先引导" class="headerlink" title="6.1）（多系统可能出现的问题）Ubuntu引导顶掉原先引导"></a>6.1）（多系统可能出现的问题）Ubuntu引导顶掉原先引导</h4><p>​    方法:重新进入BIOS将启动首选项改回</p><h5 id="6-2）-自用，请勿模仿，仅供自己参考：（本人使用OC引导）"><a href="#6-2）-自用，请勿模仿，仅供自己参考：（本人使用OC引导）" class="headerlink" title="6.2） 自用，请勿模仿，仅供自己参考：（本人使用OC引导）"></a>6.2） 自用，请勿模仿，仅供自己参考：（本人使用OC引导）</h5><p>①找到OC引导所在的位置，将\EFI\OC\config.plist备份</p><p>②将磁盘OC引导所在的EFI分区中的ubuntu文件夹移动到安装时分配的EFI分区（建立名为EFI的文件夹，将文件放入其中，原位置的ubuntu文件夹删除）</p><p>③重新替换OC分区的EFI文件夹，后将上面备份的config.plist导入新EFI文件夹（替换即可）</p><p>④重启进BIOS将启动首选项改为OC引导，保存退出重启即可进入系统选择界面</p><h3 id="至此Ubuntu系统安装步骤全部完成"><a 
href="#至此Ubuntu系统安装步骤全部完成" class="headerlink" title="至此Ubuntu系统安装步骤全部完成"></a>至此Ubuntu系统安装步骤全部完成</h3><h3 id="如果磁盘空间足够的话，建议备份系统，以防崩溃时快速恢复。"><a href="#如果磁盘空间足够的话，建议备份系统，以防崩溃时快速恢复。" class="headerlink" title="如果磁盘空间足够的话，建议备份系统，以防崩溃时快速恢复。"></a>如果磁盘空间足够的话，建议备份系统，以防崩溃时快速恢复。</h3>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;Ubuntu系统安装&quot;&gt;&lt;a href=&quot;#Ubuntu系统安装&quot; class=&quot;headerlink&quot; title=&quot;Ubuntu系统安装&quot;&gt;&lt;/a&gt;Ubuntu系统安装&lt;/h1&gt;&lt;p&gt;&lt;strong&gt;更新于2024.4.18&lt;/strong&gt;&lt;br&gt;&lt;stro</summary>
      
    
    
    
    
    <category term="Linux" scheme="https://yang-makabaka.github.io/tags/Linux/"/>
    
  </entry>
  
  <entry>
    <title>ThoroughPyTorch——5</title>
    <link href="https://yang-makabaka.github.io/posts/4419691d.html"/>
    <id>https://yang-makabaka.github.io/posts/4419691d.html</id>
    <published>2022-11-27T09:57:52.000Z</published>
    <updated>2022-11-27T09:59:34.816Z</updated>
    
    <content type="html"><![CDATA[<h1 id="PyTorch生态与部署"><a href="#PyTorch生态与部署" class="headerlink" title=" PyTorch生态与部署 "></a><center> PyTorch生态与部署 </center></h1><h1 id="1-PyTorch生态简介"><a href="#1-PyTorch生态简介" class="headerlink" title="1.PyTorch生态简介"></a>1.PyTorch生态简介</h1><p><a href="https://datawhalechina.github.io/thorough-pytorch/第八章/index.html">https://datawhalechina.github.io/thorough-pytorch/第八章/index.html</a></p><p>PyTorch的强大很大程度上取决于它的生态。</p><h2 id="（1）torchvision"><a href="#（1）torchvision" class="headerlink" title="（1）torchvision"></a>（1）torchvision</h2><div class="table-container"><table><thead><tr><th>模块</th><th>功能</th></tr></thead><tbody><tr><td>torchvision.datasets *</td><td>包含了一些我们在计算机视觉中常见的数据集</td></tr><tr><td>torchvision.models *</td><td>提供一些预训练模型</td></tr><tr><td>torchvision.transforms *</td><td>用于数据增强和处理</td></tr><tr><td>torchvision.io</td><td>视频、图片和文件的 IO 操作（读取、写入、编解码）</td></tr><tr><td>torchvision.ops</td><td><a href="https://pytorch.org/vision/stable/ops.html">提供了许多计算机视觉的特定操作</a></td></tr><tr><td>torchvision.utils</td><td><a href="https://pytorch.org/vision/stable/utils.html">提供了一些可视化的方法</a></td></tr></tbody></table></div><h2 id="（2）PyTorchVideo"><a href="#（2）PyTorchVideo" class="headerlink" title="（2）PyTorchVideo"></a>（2）PyTorchVideo</h2><p>PyTorchVideo 提供了加速视频理解研究所需的模块化和高效的API。它还支持不同的深度学习视频组件，如视频模型、视频数据集和视频特定转换，最重要的是，PyTorchVideo也提供了model zoo，使得人们可以使用各种先进的预训练视频模型及其评判基准。</p><p>主要特点：基于 PyTorch、高质量的model zoo、支持主流数据集及预处理、模块化设计、支持多模态、移动端部署优化。</p><h2 id="（3）torchtext"><a href="#（3）torchtext" class="headerlink" title="（3）torchtext"></a>（3）torchtext</h2><ul><li>数据处理工具 torchtext.data.functional、torchtext.data.utils</li><li>数据集 torchtext.datasets</li><li>词表工具 torchtext.vocab</li><li>评测指标 torchtext.data.metrics</li></ul><h4 id="构建数据集"><a href="#构建数据集" class="headerlink" title="构建数据集"></a>构建数据集</h4><ul><li><strong>Field及其使用</strong></li></ul><p>①构建Field</p><figure class="highlight 
plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">tokenize = lambda x: x.split()</span><br><span class="line">TEXT = data.Field(sequential=True, tokenize=tokenize, lower=True, fix_length=200)</span><br><span class="line">LABEL = data.Field(sequential=False, use_vocab=False)</span><br></pre></td></tr></table></figure><p>②进一步构建dataset</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">from torchtext import data</span><br><span class="line">def get_dataset(csv_data, text_field, label_field, test=False):</span><br><span class="line">    fields = [(&quot;id&quot;, None), # we won&#x27;t be needing the id, so we pass in None as the field</span><br><span class="line">                 (&quot;comment_text&quot;, text_field), (&quot;toxic&quot;, label_field)]       </span><br><span class="line">    examples = []</span><br><span class="line"></span><br><span class="line">    if test:</span><br><span class="line">        # 如果为测试集，则不加载label</span><br><span class="line">        for text in tqdm(csv_data[&#x27;comment_text&#x27;]):</span><br><span class="line">            examples.append(data.Example.fromlist([None, text, None], fields))</span><br><span class="line">    else:</span><br><span class="line">        for text, label in tqdm(zip(csv_data[&#x27;comment_text&#x27;], csv_data[&#x27;toxic&#x27;])):</span><br><span class="line">    
        examples.append(data.Example.fromlist([None, text, label], fields))</span><br><span class="line">    return examples, fields</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">train_data = pd.read_csv(&#x27;train_toxic_comments.csv&#x27;)</span><br><span class="line">valid_data = pd.read_csv(&#x27;valid_toxic_comments.csv&#x27;)</span><br><span class="line">test_data = pd.read_csv(&quot;test_toxic_comments.csv&quot;)</span><br><span class="line">TEXT = data.Field(sequential=True, tokenize=tokenize, lower=True)</span><br><span class="line">LABEL = data.Field(sequential=False, use_vocab=False)</span><br><span class="line"></span><br><span class="line"># 得到构建Dataset所需的examples和fields</span><br><span class="line">train_examples, train_fields = get_dataset(train_data, TEXT, LABEL)</span><br><span class="line">valid_examples, valid_fields = get_dataset(valid_data, TEXT, LABEL)</span><br><span class="line">test_examples, test_fields = get_dataset(test_data, TEXT, None, test=True)</span><br><span class="line"># 构建Dataset数据集</span><br><span class="line">train = data.Dataset(train_examples, train_fields)</span><br><span class="line">valid = data.Dataset(valid_examples, valid_fields)</span><br><span class="line">test = 
data.Dataset(test_examples, test_fields)</span><br><span class="line"></span><br><span class="line"># 检查keys是否正确</span><br><span class="line">print(train[0].__dict__.keys())</span><br><span class="line">print(test[0].__dict__.keys())</span><br><span class="line"># 抽查内容是否正确</span><br><span class="line">print(train[0].comment_text)</span><br></pre></td></tr></table></figure><ul><li><strong>词汇表（vocab）</strong></li></ul><p>构建词语到向量（或数字）的映射关系</p><p>在torchtext中可以使用Field自带的build_vocab函数完成词汇表构建。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">TEXT.build_vocab(train)</span><br></pre></td></tr></table></figure><ul><li><strong>数据迭代器</strong></li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">from torchtext.data import Iterator, BucketIterator</span><br><span class="line"># 若只针对训练集构造迭代器</span><br><span class="line"># train_iter = data.BucketIterator(dataset=train, batch_size=8, shuffle=True, sort_within_batch=False, repeat=False)</span><br><span class="line"></span><br><span class="line"># 同时对训练集和验证集进行迭代器的构建</span><br><span class="line">train_iter, val_iter = BucketIterator.splits(</span><br><span class="line">        (train, valid), # 构建数据集所需的数据集</span><br><span class="line">        batch_sizes=(8, 8),</span><br><span class="line">        device=-1, # 如果使用gpu，此处将-1更换为GPU的编号</span><br><span class="line">        sort_key=lambda x: len(x.comment_text), # the 
BucketIterator needs to be told what function it should use to group the data.</span><br><span class="line">        sort_within_batch=False</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">test_iter = Iterator(test, batch_size=8, device=-1, sort=False, sort_within_batch=False)</span><br></pre></td></tr></table></figure><h1 id="2-PyTorch模型部署"><a href="#2-PyTorch模型部署" class="headerlink" title="2.PyTorch模型部署"></a>2.PyTorch模型部署</h1><p><a href="https://datawhalechina.github.io/thorough-pytorch/第九章/index.html">https://datawhalechina.github.io/thorough-pytorch/第九章/index.html</a></p><p><img src="https://niuzhikang.oss-cn-chengdu.aliyuncs.com/figures/202208052139305.jpg" alt="微信图片_20220805213616"></p><h2 id="ONNX"><a href="#ONNX" class="headerlink" title="ONNX"></a>ONNX</h2><h4 id="（1）ONNX简介"><a href="#（1）ONNX简介" class="headerlink" title="（1）ONNX简介"></a>（1）ONNX简介</h4><p>①ONNX</p><ul><li>ONNX官网：<a href="https://onnx.ai/">https://onnx.ai/</a></li><li>ONNX GitHub：<a href="https://github.com/onnx/onnx">https://github.com/onnx/onnx</a></li></ul><p>通过定义一组与环境和平台无关的标准格式，使AI模型可以在不同框架和环境下交互使用。</p><p>使用不同框架训练的模型，转化为ONNX格式后，可以很容易的部署在兼容ONNX的运行环境中。</p><p>②ONNX Runtime</p><ul><li>ONNX Runtime官网：<a href="https://www.onnxruntime.ai/">https://www.onnxruntime.ai/</a></li><li>ONNX Runtime GitHub：<a href="https://github.com/microsoft/onnxruntime">https://github.com/microsoft/onnxruntime</a></li></ul><p>跨平台机器学习推理加速器，可直接读取 .onnx 格式的文件。</p><p>③安装</p><p>ONNX和ONNX Runtime的适配关系：<a href="https://github.com/microsoft/onnxruntime/blob/master/docs/Versioning.md">https://github.com/microsoft/onnxruntime/blob/master/docs/Versioning.md</a></p><p>使用GPU进行推理时，需要卸载onnxruntime，再安装onnxruntime-gpu，同时还需考虑ONNX Runtime与CUDA之间的适配关系，<a href="https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html">参考链接</a></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"># 激活虚拟环境</span><br><span class="line">conda activate env_name # env_name换成环境名称</span><br><span class="line"># 安装onnx</span><br><span class="line">pip install onnx </span><br><span class="line"># 安装onnx runtime</span><br><span class="line">pip install onnxruntime # 使用CPU进行推理</span><br><span class="line"># pip install onnxruntime-gpu # 使用GPU进行推理</span><br></pre></td></tr></table></figure><h4 id="（2）模型导出为ONNX"><a href="#（2）模型导出为ONNX" class="headerlink" title="（2）模型导出为ONNX"></a>（2）模型导出为ONNX</h4><p>使用<code>torch.onnx.export()</code>把模型转换成 ONNX 格式的函数</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">import torch.onnx </span><br><span class="line"># 转换的onnx格式的名称，文件后缀需为.onnx</span><br><span class="line">onnx_file_name = &quot;xxxxxx.onnx&quot;</span><br><span class="line"># 我们需要转换的模型，将torch_model设置为自己的模型</span><br><span 
class="line">model = torch_model</span><br><span class="line"># 加载权重，将model.pth替换为自己的模型权重文件；load_state_dict不返回模型，不能把返回值赋给model</span><br><span class="line"># 如果模型的权重是使用多卡训练出来，我们需要去除权重中多的module. 具体操作可以见5.4节</span><br><span class="line">model.load_state_dict(torch.load(&quot;model.pth&quot;))</span><br><span class="line"># 导出模型前，必须调用model.eval()或者model.train(False)</span><br><span class="line">model.eval()</span><br><span class="line"># dummy_input就是一个输入的实例，仅提供输入shape、type等信息 </span><br><span class="line">batch_size = 1 # 随机的取值，当设置dynamic_axes后影响不大</span><br><span class="line">dummy_input = torch.randn(batch_size, 1, 224, 224, requires_grad=True) </span><br><span class="line"># 这组输入对应的模型输出</span><br><span class="line">output = model(dummy_input)</span><br><span class="line"># 导出模型（需确保我们的模型处在推理模式）</span><br><span class="line">torch.onnx.export(model,        # 要导出的模型</span><br><span class="line">                  dummy_input,   # 一组实例化输入</span><br><span class="line">                  onnx_file_name,   # 文件保存路径/名称</span><br><span class="line">                  export_params=True,        #  如果指定为True或默认, 参数也会被导出. 
如果要导出未训练的模型，可设为 False.</span><br><span class="line">                  opset_version=10,          # ONNX 算子集的版本，撰写本文时已更新到15</span><br><span class="line">                  do_constant_folding=True,  # 是否执行常量折叠优化</span><br><span class="line">                  input_names = [&#x27;input&#x27;],   # 输入模型的张量的名称</span><br><span class="line">                  output_names = [&#x27;output&#x27;], # 输出模型的张量的名称</span><br><span class="line">                  # dynamic_axes将batch_size的维度指定为动态，</span><br><span class="line">                  # 后续进行推理的数据可以与导出的dummy_input的batch_size不同</span><br><span class="line">                  dynamic_axes=&#123;&#x27;input&#x27; : &#123;0 : &#x27;batch_size&#x27;&#125;,    </span><br><span class="line">                                &#x27;output&#x27; : &#123;0 : &#x27;batch_size&#x27;&#125;&#125;)</span><br></pre></td></tr></table></figure><h4 id="（3）可用性检查"><a href="#（3）可用性检查" class="headerlink" title="（3）可用性检查"></a>（3）可用性检查</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">import onnx</span><br><span class="line"># 我们可以使用异常处理的方法进行检验</span><br><span class="line">try:</span><br><span class="line">    # 先用onnx.load读取导出的文件（文件名与导出时一致），模型不可用时将会报出异常</span><br><span class="line">    onnx.checker.check_model(onnx.load(&quot;xxxxxx.onnx&quot;))</span><br><span class="line">except onnx.checker.ValidationError as e:</span><br><span class="line">    print(&quot;The model is invalid: %s&quot;%e)</span><br><span class="line">else:</span><br><span class="line">    # 模型可用时，将不会报出异常，并会输出“The model is valid!”</span><br><span class="line">    print(&quot;The model is 
valid!&quot;)</span><br></pre></td></tr></table></figure><h4 id="（4）可视化"><a href="#（4）可视化" class="headerlink" title="（4）可视化"></a>（4）可视化</h4><p><strong>Netron</strong></p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/screenshot.png" alt="img"></p>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;PyTorch生态与部署&quot;&gt;&lt;a href=&quot;#PyTorch生态与部署&quot; class=&quot;headerlink&quot; title=&quot; PyTorch生态与部署 &quot;&gt;&lt;/a&gt;&lt;center&gt; PyTorch生态与部署 &lt;/center&gt;&lt;/h1&gt;&lt;h1 id=&quot;1-Py</summary>
      
    
    
    
    
    <category term="笔记" scheme="https://yang-makabaka.github.io/tags/%E7%AC%94%E8%AE%B0/"/>
    
  </entry>
  
  <entry>
    <title>ThoroughPytorch——4</title>
    <link href="https://yang-makabaka.github.io/posts/2b2f8273.html"/>
    <id>https://yang-makabaka.github.io/posts/2b2f8273.html</id>
    <published>2022-11-25T15:29:13.000Z</published>
    <updated>2022-11-27T11:12:31.201Z</updated>
    
    <content type="html"><![CDATA[<h1 id="第七章：PyTorch可视化"><a href="#第七章：PyTorch可视化" class="headerlink" title="第七章：PyTorch可视化"></a>第七章：PyTorch可视化</h1><p><a href="https://datawhalechina.github.io/thorough-pytorch/第七章/index.html">第七章：PyTorch可视化 — 深入浅出PyTorch (datawhalechina.github.io)</a></p><h2 id="7-1-可视化网络结构"><a href="#7-1-可视化网络结构" class="headerlink" title="7.1 可视化网络结构"></a>7.1 可视化网络结构</h2><p>使用torchinfo来可视化网络结构</p><ul><li><strong>torchinfo的安装</strong></li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"># 安装方法一</span><br><span class="line">pip install torchinfo </span><br><span class="line"># 安装方法二</span><br><span class="line">conda install -c conda-forge torchinfo</span><br></pre></td></tr></table></figure><ul><li><strong>torchinfo的使用</strong></li></ul><p>只需使用<code>torchinfo.summary()</code>，</p><p>必需的参数分别是model，input_size[batch_size,channel,h,w]</p><p>更多参数可以参考<a href="https://github.com/TylerYep/torchinfo#documentation">documentation</a></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"># 例</span><br><span class="line">import torchvision.models as models</span><br><span class="line">from torchinfo import summary</span><br><span class="line">resnet18 = models.resnet18() # 实例化模型</span><br><span class="line">summary(resnet18, (1, 3, 224, 224)) # 1：batch_size 3:图片的通道数 224: 图片的高宽</span><br></pre></td></tr></table></figure><p>torchinfo提供了更加详细的信息，包括模块信息（每一层的类型、输出shape和参数量）、模型整体的参数量、模型大小、一次前向或者反向传播需要的内存大小等。</p><h2 id="7-2-CNN可视化"><a href="#7-2-CNN可视化" class="headerlink" title="7.2 CNN可视化"></a>7.2 CNN可视化</h2><h4 id="7-2-1-卷积核可视化"><a 
href="#7-2-1-卷积核可视化" class="headerlink" title="7.2.1 卷积核可视化"></a>7.2.1 Visualizing convolution kernels</h4><p>Convolution kernels extract features in a CNN. Visualizing them helps show what kind of features each layer extracts, and in turn how the model works.</p><p>Visualizing kernels in PyTorch is straightforward: the kernels of a layer are simply the weights of that layer, so visualizing the kernels amounts to visualizing the corresponding weight tensor.</p><p><strong>First load the model and inspect its layers:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">import torch</span><br><span class="line">import matplotlib.pyplot as plt</span><br><span class="line">from torchvision.models import vgg11</span><br><span class="line"></span><br><span class="line">model = vgg11(pretrained=True)</span><br><span class="line">print(dict(model.features.named_children()))</span><br></pre></td></tr></table></figure><p><strong>Kernels belong to convolutional layers (Conv2d). Taking layer "3" as an example, visualize its parameters:</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">conv1 = dict(model.features.named_children())[&#x27;3&#x27;]</span><br><span class="line">kernel_set = conv1.weight.detach()</span><br><span class="line">num = len(conv1.weight.detach())</span><br><span class="line">print(kernel_set.shape)</span><br><span class="line">for i in range(0,num):</span><br><span class="line">    i_kernel = kernel_set[i]</span><br><span class="line">    plt.figure(figsize=(20, 17))</span><br><span class="line">    if (len(i_kernel)) &gt; 1:</span><br><span class="line">        for idx, filer in enumerate(i_kernel):</span><br><span class="line">        
    plt.subplot(9, 9, idx+1) </span><br><span class="line">            plt.axis(&#x27;off&#x27;)</span><br><span class="line">            plt.imshow(filer[ :, :].detach(),cmap=&#x27;bwr&#x27;)</span><br><span class="line">torch.Size([128, 64, 3, 3])</span><br></pre></td></tr></table></figure><p>由于第“3”层的特征图由64维变为128维，因此共有128*64个卷积核，其中部分卷积核可视化效果如下图所示：</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/kernel_vis.png" alt="kernel"></p><h4 id="7-2-2-特征图可视化"><a href="#7-2-2-特征图可视化" class="headerlink" title="7.2.2 特征图可视化"></a>7.2.2 特征图可视化</h4><p>输入的原始图像经过每次卷积层得到的数据称为特征图，可视化即查看模型提取到的特征是什么样的。</p><p><strong><em>hook</em></strong>：PyTorch提供的<strong>使得网络在前向传播过程中能够获取到特征图</strong>的一个专用接口。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span 
class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line">class Hook(object): #定义Hook类</span><br><span class="line">    def __init__(self):</span><br><span class="line">        self.module_name = []</span><br><span class="line">        self.features_in_hook = []</span><br><span class="line">        self.features_out_hook = []</span><br><span class="line"></span><br><span class="line">    def __call__(self,module, fea_in, fea_out):</span><br><span class="line">        print(&quot;hooker working&quot;, self)</span><br><span class="line">        self.module_name.append(module.__class__)</span><br><span class="line">        self.features_in_hook.append(fea_in)</span><br><span class="line">        self.features_out_hook.append(fea_out) #存储当前层的输入和输出</span><br><span class="line">        return None</span><br><span class="line">    </span><br><span class="line"></span><br><span class="line">def plot_feature(model, idx, inputs):</span><br><span class="line">    hh = Hook()</span><br><span class="line">    model.features[idx].register_forward_hook(hh)  #该hook类的对象注册到要进行可视化的网络的某层中</span><br><span class="line">    </span><br><span class="line">    # forward_model(model,False)</span><br><span class="line">    model.eval()</span><br><span class="line">    _ = model(inputs)</span><br><span class="line">    print(hh.module_name)</span><br><span class="line">    print((hh.features_in_hook[0][0].shape))</span><br><span class="line">    print((hh.features_out_hook[0].shape))</span><br><span class="line">    </span><br><span class="line">    out1 = hh.features_out_hook[0]</span><br><span class="line"></span><br><span class="line">    total_ft  = out1.shape[1]</span><br><span class="line">    first_item = out1[0].cpu().clone()    </span><br><span class="line"></span><br><span class="line">    plt.figure(figsize=(20, 17))</span><br><span class="line">    </span><br><span 
class="line"></span><br><span class="line">    for ftidx in range(total_ft):</span><br><span class="line">        if ftidx &gt; 99:</span><br><span class="line">            break</span><br><span class="line">        ft = first_item[ftidx]</span><br><span class="line">        plt.subplot(10, 10, ftidx+1) </span><br><span class="line">        </span><br><span class="line">        plt.axis(&#x27;off&#x27;)</span><br><span class="line">        #plt.imshow(ft[ :, :].detach(),cmap=&#x27;gray&#x27;)</span><br><span class="line">        plt.imshow(ft[ :, :].detach())</span><br></pre></td></tr></table></figure><h4 id="7-2-3-class-activation-map可视化"><a href="#7-2-3-class-activation-map可视化" class="headerlink" title="7.2.3 class activation map可视化"></a>7.2.3 class activation map可视化</h4><p>class activation map （CAM）的作用是判断哪些变量（可视化场景下为像素点）对模型来说是重要的。</p><p>同时为了判断重要区域的梯度等信息，衍生出了Grad-CAM等诸多变种。</p><p>相较于上两条，CAM能一目了然地确定重要区域，进而进行可解释性分析或模型优化改进。</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/cam.png" alt="cam"></p><p>实现方法：</p><p><strong>pytorch-grad-cam</strong></p><ul><li>安装</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install grad-cam</span><br></pre></td></tr></table></figure><ul><li>一个简单的例子</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">import torch</span><br><span class="line">from torchvision.models import 
vgg11,resnet18,resnet101,resnext101_32x8d</span><br><span class="line">import matplotlib.pyplot as plt</span><br><span class="line">from PIL import Image</span><br><span class="line">import numpy as np</span><br><span class="line"></span><br><span class="line">model = vgg11(pretrained=True)</span><br><span class="line">img_path = &#x27;./dog.png&#x27;</span><br><span class="line"># Convert to RGB (drops any alpha channel) and resize to the image size the network was trained on</span><br><span class="line">img = Image.open(img_path).convert(&#x27;RGB&#x27;).resize((224,224))</span><br><span class="line"># The image must be np.float32 with values in the range 0-1</span><br><span class="line">rgb_img = np.float32(img)/255</span><br><span class="line">plt.imshow(img)</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/dog.png" alt="dog"></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">from pytorch_grad_cam import GradCAM,ScoreCAM,GradCAMPlusPlus,AblationCAM,XGradCAM,EigenCAM,FullGrad</span><br><span class="line">from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget</span><br><span class="line">from pytorch_grad_cam.utils.image import show_cam_on_image</span><br><span class="line"></span><br><span class="line"># Assumption (not in the original snippet): convert the HWC float image to a 1xCxHxW tensor</span><br><span class="line">img_tensor = torch.from_numpy(rgb_img).permute(2, 0, 1).unsqueeze(0)</span><br><span class="line">preds = 200  # target class index; 200 is an arbitrary ImageNet class used for illustration</span><br><span class="line">target_layers = [model.features[-1]]</span><br><span class="line"># Pick a suitable CAM variant; ScoreCAM and AblationCAM additionally need batch_size</span><br><span class="line">cam = GradCAM(model=model,target_layers=target_layers)</span><br><span class="line">targets = [ClassifierOutputTarget(preds)]</span><br><span 
class="line"># 上方preds需要设定，比如ImageNet有1000类，这里可以设为200</span><br><span class="line">grayscale_cam = cam(input_tensor=img_tensor, targets=targets)</span><br><span class="line">grayscale_cam = grayscale_cam[0, :]</span><br><span class="line">cam_img = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)</span><br><span class="line">print(type(cam_img))</span><br><span class="line">Image.fromarray(cam_img)</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/cam_dog.png" alt="grad_cam"></p><h4 id="7-2-4-FlashTorch快速可视化"><a href="#7-2-4-FlashTorch快速可视化" class="headerlink" title="7.2.4 FlashTorch快速可视化"></a>7.2.4 FlashTorch快速可视化</h4><p>对环境有要求：<a href="https://github.com/MisaOgura/flashtorch/issues/39">https://github.com/MisaOgura/flashtorch/issues/39</a></p><ul><li>安装</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install flashtorch</span><br></pre></td></tr></table></figure><ul><li>可视化梯度</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"># Download example images</span><br><span class="line"># !mkdir -p images</span><br><span class="line"># !wget -nv 
\</span><br><span class="line">#    https://github.com/MisaOgura/flashtorch/raw/master/examples/images/great_grey_owl.jpg \</span><br><span class="line">#    https://github.com/MisaOgura/flashtorch/raw/master/examples/images/peacock.jpg   \</span><br><span class="line">#    https://github.com/MisaOgura/flashtorch/raw/master/examples/images/toucan.jpg    \</span><br><span class="line">#    -P /content/images</span><br><span class="line"></span><br><span class="line">import matplotlib.pyplot as plt</span><br><span class="line">import torchvision.models as models</span><br><span class="line">from flashtorch.utils import apply_transforms, load_image</span><br><span class="line">from flashtorch.saliency import Backprop</span><br><span class="line"></span><br><span class="line">model = models.alexnet(pretrained=True)</span><br><span class="line">backprop = Backprop(model)</span><br><span class="line"></span><br><span class="line">image = load_image(&#x27;/content/images/great_grey_owl.jpg&#x27;)</span><br><span class="line">owl = apply_transforms(image)</span><br><span class="line"></span><br><span class="line">target_class = 24</span><br><span class="line">backprop.visualize(owl, target_class, guided=True, use_gpu=True)</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/ft_gradient.png" alt="ft-gradient"></p><ul><li>可视化卷积核</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">import torchvision.models as models</span><br><span class="line">from flashtorch.activmax import 
GradientAscent</span><br><span class="line"></span><br><span class="line">model = models.vgg16(pretrained=True)</span><br><span class="line">g_ascent = GradientAscent(model.features)</span><br><span class="line"></span><br><span class="line"># specify layer and filter info</span><br><span class="line">conv5_1 = model.features[24]</span><br><span class="line">conv5_1_filters = [45, 271, 363, 489]</span><br><span class="line"></span><br><span class="line">g_ascent.visualize(conv5_1, conv5_1_filters, title=&quot;VGG16: conv5_1&quot;)</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/ft_activate.png" alt="ft-activate"></p><h2 id="7-3-使用TensorBoard可视化训练过程"><a href="#7-3-使用TensorBoard可视化训练过程" class="headerlink" title="7.3 使用TensorBoard可视化训练过程"></a>7.3 使用TensorBoard可视化训练过程</h2><p>可视化你所想可视化的所有内容。</p><h4 id="7-3-1安装"><a href="#7-3-1安装" class="headerlink" title="7.3.1安装"></a>7.3.1安装</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install tensorboardX</span><br></pre></td></tr></table></figure><h4 id="7-3-2可视化的基本逻辑"><a href="#7-3-2可视化的基本逻辑" class="headerlink" title="7.3.2可视化的基本逻辑"></a>7.3.2可视化的基本逻辑</h4><p>TensorBoard会将模型每一层的数据保存在指定位置 并以网页的形式可视化。</p><h4 id="7-3-3-配置与启动"><a href="#7-3-3-配置与启动" class="headerlink" title="7.3.3 配置与启动"></a>7.3.3 配置与启动</h4><p>①首先指定一个文件夹供TensorBoard保存记录下来的数据，然后调用tensorboard中的SummaryWriter。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">from tensorboardX import SummaryWriter</span><br><span class="line"></span><br><span class="line">writer = SummaryWriter(&#x27;./指定位置&#x27;) #可手动往文件夹里添加数据，也可以提取到其他机器</span><br></pre></td></tr></table></figure><p>※如果使用PyTorch自带的tensorboard，则采用如下方式import：</p><figure 
class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">from torch.utils.tensorboard import SummaryWriter</span><br></pre></td></tr></table></figure><p>②启动tensorboard</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">tensorboard --logdir=/path/to/logs/ --port=xxxx</span><br><span class="line">#“path/to/logs/&quot;是指定的保存tensorboard记录结果的文件路径</span><br><span class="line">#port是外部访问TensorBoard的端口号，可以通过访问ip:port访问tensorboard</span><br></pre></td></tr></table></figure><h4 id="7-3-4-模型结构可视化"><a href="#7-3-4-模型结构可视化" class="headerlink" title="7.3.4 模型结构可视化"></a>7.3.4 模型结构可视化</h4><p>首先定义模型：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line">import torch.nn as nn</span><br><span class="line"></span><br><span 
class="line">class Net(nn.Module):</span><br><span class="line">    def __init__(self):</span><br><span class="line">        super(Net, self).__init__()</span><br><span class="line">        self.conv1 = nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3)</span><br><span class="line">        self.pool = nn.MaxPool2d(kernel_size = 2,stride = 2)</span><br><span class="line">        self.conv2 = nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5)</span><br><span class="line">        self.adaptive_pool = nn.AdaptiveMaxPool2d((1,1))</span><br><span class="line">        self.flatten = nn.Flatten()</span><br><span class="line">        self.linear1 = nn.Linear(64,32)</span><br><span class="line">        self.relu = nn.ReLU()</span><br><span class="line">        self.linear2 = nn.Linear(32,1)</span><br><span class="line">        self.sigmoid = nn.Sigmoid()</span><br><span class="line"></span><br><span class="line">    def forward(self,x):</span><br><span class="line">        x = self.conv1(x)</span><br><span class="line">        x = self.pool(x)</span><br><span class="line">        x = self.conv2(x)</span><br><span class="line">        x = self.pool(x)</span><br><span class="line">        x = self.adaptive_pool(x)</span><br><span class="line">        x = self.flatten(x)</span><br><span class="line">        x = self.linear1(x)</span><br><span class="line">        x = self.relu(x)</span><br><span class="line">        x = self.linear2(x)</span><br><span class="line">        y = self.sigmoid(x)</span><br><span class="line">        return y</span><br><span class="line"></span><br><span class="line">model = Net()</span><br><span class="line">print(model)</span><br></pre></td></tr></table></figure><p>输出如下：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">Net(</span><br><span class="line">  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))</span><br><span class="line">  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)</span><br><span class="line">  (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))</span><br><span class="line">  (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))</span><br><span class="line">  (flatten): Flatten(start_dim=1, end_dim=-1)</span><br><span class="line">  (linear1): Linear(in_features=64, out_features=32, bias=True)</span><br><span class="line">  (relu): ReLU()</span><br><span class="line">  (linear2): Linear(in_features=32, out_features=1, bias=True)</span><br><span class="line">  (sigmoid): Sigmoid()</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>可视化模型的思路和7.1中介绍的方法一样，都是给定一个输入数据，前向传播后得到模型的结构，再通过TensorBoard进行可视化，使用add_graph：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">writer.add_graph(model, input_to_model = torch.rand(1, 3, 224, 224))</span><br><span class="line">writer.close()</span><br></pre></td></tr></table></figure><p>展示结果如下（其中框内部分初始会显示为“Net”，需要双击后才会展开）：</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/tb_model.png" alt="tb_model"></p><h4 id="7-3-5-TensorBoard图像可视化"><a href="#7-3-5-TensorBoard图像可视化" class="headerlink" title="7.3.5 TensorBoard图像可视化"></a>7.3.5 TensorBoard图像可视化</h4><ul><li>对于单张图片的显示使用add_image</li><li>对于多张图片的显示使用add_images</li><li>有时需要使用torchvision.utils.make_grid将多张图片拼成一张图片后，用writer.add_image显示</li></ul><p>以torchvision的CIFAR10数据集为例：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line">import torchvision</span><br><span class="line">from torchvision import datasets, transforms</span><br><span class="line">from torch.utils.data import DataLoader</span><br><span class="line"></span><br><span class="line">transform_train = transforms.Compose(</span><br><span class="line">    [transforms.ToTensor()])</span><br><span class="line">transform_test = transforms.Compose(</span><br><span class="line">    [transforms.ToTensor()])</span><br><span class="line"></span><br><span class="line">train_data = datasets.CIFAR10(&quot;.&quot;, train=True, download=True, transform=transform_train)</span><br><span class="line">test_data = datasets.CIFAR10(&quot;.&quot;, train=False, download=True, transform=transform_test)</span><br><span class="line">train_loader = DataLoader(train_data, batch_size=64, shuffle=True)</span><br><span class="line">test_loader = DataLoader(test_data, 
batch_size=64)</span><br><span class="line"></span><br><span class="line">images, labels = next(iter(train_loader))</span><br><span class="line"> </span><br><span class="line">#依次进行以下三组可视化</span><br><span class="line"># 仅查看一张图片</span><br><span class="line">writer = SummaryWriter(&#x27;./pytorch_tb&#x27;)</span><br><span class="line">writer.add_image(&#x27;images[0]&#x27;, images[0])</span><br><span class="line">writer.close()</span><br><span class="line"> </span><br><span class="line"># 将多张图片拼接成一张图片，中间用黑色网格分割</span><br><span class="line"># create grid of images</span><br><span class="line">writer = SummaryWriter(&#x27;./pytorch_tb&#x27;)</span><br><span class="line">img_grid = torchvision.utils.make_grid(images)</span><br><span class="line">writer.add_image(&#x27;image_grid&#x27;, img_grid)</span><br><span class="line">writer.close()</span><br><span class="line"> </span><br><span class="line"># 将多张图片直接写入</span><br><span class="line">writer = SummaryWriter(&#x27;./pytorch_tb&#x27;)</span><br><span class="line">writer.add_images(&quot;images&quot;,images,global_step = 0)</span><br><span class="line">writer.close()</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/tb_image.png" alt="tb_image"></p><p><img src="img_/pytorch1.png" alt="image-20221125220114712"></p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/tb_images.png" alt="tb_images"></p><h4 id="7-3-6-TensorBoard连续变量可视化"><a href="#7-3-6-TensorBoard连续变量可视化" class="headerlink" title="7.3.6 TensorBoard连续变量可视化"></a>7.3.6 TensorBoard连续变量可视化</h4><p>适合损失函数的可视化，可以更加直观地了解模型的训练情况，从而确定最佳的checkpoint。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span 
class="line">writer = SummaryWriter(&#x27;./pytorch_tb&#x27;)</span><br><span class="line">for i in range(500):</span><br><span class="line">    x = i</span><br><span class="line">    y = x**2</span><br><span class="line">    writer.add_scalar(&quot;x&quot;, x, i) #日志中记录x在第step i 的值</span><br><span class="line">    writer.add_scalar(&quot;y&quot;, y, i) #日志中记录y在第step i 的值</span><br><span class="line">writer.close()</span><br></pre></td></tr></table></figure><p>可视化结果如下：</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/tb_scalar.png" alt="tb_scalar"></p><p>如果想在同一张图中显示多个曲线，则需要分别建立存放子路径（使用SummaryWriter指定路径即可自动创建，但需要在tensorboard运行目录下），同时在add_scalar中修改曲线的标签使其一致即可：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">writer1 = SummaryWriter(&#x27;./pytorch_tb/x&#x27;)</span><br><span class="line">writer2 = SummaryWriter(&#x27;./pytorch_tb/y&#x27;)</span><br><span class="line">for i in range(500):</span><br><span class="line">    x = i</span><br><span class="line">    y = x*2</span><br><span class="line">    writer1.add_scalar(&quot;same&quot;, x, i) #日志中记录x在第step i 的值</span><br><span class="line">    writer2.add_scalar(&quot;same&quot;, y, i) #日志中记录y在第step i 的值</span><br><span class="line">writer1.close()</span><br><span class="line">writer2.close()</span><br></pre></td></tr></table></figure><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/tb_twolines.png" alt="tb_scalar_twolines"></p><h4 id="7-3-7-TensorBoard参数分布可视化"><a href="#7-3-7-TensorBoard参数分布可视化" class="headerlink" title="7.3.7 TensorBoard参数分布可视化"></a>7.3.7 TensorBoard参数分布可视化</h4><figure class="highlight 
plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"># Example</span><br><span class="line">import torch</span><br><span class="line">from tensorboardX import SummaryWriter</span><br><span class="line"></span><br><span class="line"># Create a normally distributed tensor to simulate a parameter matrix</span><br><span class="line">def norm(mean, std):</span><br><span class="line">    t = std * torch.randn((100, 20)) + mean</span><br><span class="line">    return t</span><br><span class="line"> </span><br><span class="line">writer = SummaryWriter(&#x27;./pytorch_tb/&#x27;)</span><br><span class="line">for step, mean in enumerate(range(-10, 10, 1)):</span><br><span class="line">    w = norm(mean, 1)</span><br><span class="line">    writer.add_histogram(&quot;w&quot;, w, step)</span><br><span class="line">    writer.flush()</span><br><span class="line">writer.close()</span><br></pre></td></tr></table></figure><h4 id="7-3-8-服务器端使用TensorBoard"><a href="#7-3-8-服务器端使用TensorBoard" class="headerlink" title="7.3.8 服务器端使用TensorBoard"></a>7.3.8 Using TensorBoard on a server</h4><p><strong>(1) macOS</strong></p><p><strong>Open a terminal and run the following commands in order:</strong></p><p>Activate the tensorflow environment: source activate tensorflow<br>Change to the log directory: cd desktop/tensorflow/<br>Run TensorBoard: tensorboard --logdir="log"</p><p><strong>Then open http://localhost:6006 in a browser.</strong></p><p><strong>(2) MobaXterm</strong></p><ol><li>In MobaXterm, click Tunneling.</li><li>Choose New SSH tunnel; the following dialog appears.</li></ol><p><img src="img_/pytorch2.png" alt="image-20221125225501168"></p><p><img src="img_/pytorch3.png" 
alt="image-20221125230142013"></p><ol start="3"><li>Configure the new SSH tunnel as follows. In the first field select <code>Local port forwarding</code>. Set <code>&lt;Remote Server&gt;</code> to <strong>localhost</strong> and <code>&lt;Remote port&gt;</code> to 6006, the port TensorBoard serves on by default; if you start it with <strong>tensorboard --logdir=/path/to/logs/ --port=xxxx</strong>, use that port instead. Set <code>&lt;SSH server&gt;</code> to the IP address of the server, <code>&lt;SSH login&gt;</code> to your username on that server, and <code>&lt;SSH port&gt;</code> to its SSH port (usually 22). <code>&lt;Forwarded port&gt;</code> is a local port through which you will reach TensorBoard later.</li><li>When the settings are done, click Save and then Start. Next, start TensorBoard on the server; you can then open it in a local browser at <code>http://localhost:6006/</code> (assuming 6006 was also chosen as the forwarded port).</li></ol><p><strong>(3) Xshell</strong></p><ol><li>Connecting through Xshell works essentially the same way as with MobaXterm. The steps are:</li><li>After connecting to the server, open the properties of the current session. The dialog below appears; select Tunneling and click Add. <img src="https://datawhalechina.github.io/thorough-pytorch/_images/xshell_ui.png" alt="xhell_ui"></li><li>Fill in the settings as shown below. The destination host is the server and the source host is the local machine; choose the ports to suit your setup. <img src="https://datawhalechina.github.io/thorough-pytorch/_images/xshell_set.png" alt="xhell_set"></li><li>Start TensorBoard, then open 127.0.0.1:6006 or localhost:6006 locally.</li></ol><p><strong>(4) SSH</strong></p><p>This method forwards port 6006 on the server to your own machine. Run the following in a local terminal, where 16006 is the local port and 6006 is the port on the server.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ssh -L 16006:127.0.0.1:6006 username@remote_server_ip</span><br></pre></td></tr></table></figure><p>Start TensorBoard on the server as usual, on the default port 6006:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">tensorboard --logdir=xxx --port=6006</span><br></pre></td></tr></table></figure><p>Then enter the address in a local browser:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">127.0.0.1:16006 or localhost:16006</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;第七章：PyTorch可视化&quot;&gt;&lt;a href=&quot;#第七章：PyTorch可视化&quot; class=&quot;headerlink&quot; title=&quot;第七章：PyTorch可视化&quot;&gt;&lt;/a&gt;第七章：PyTorch可视化&lt;/h1&gt;&lt;p&gt;&lt;a href=&quot;https://dataw</summary>
      
    
    
    
    
    <category term="笔记" scheme="https://yang-makabaka.github.io/tags/%E7%AC%94%E8%AE%B0/"/>
    
  </entry>
  
  <entry>
    <title>ThoroughPytorch——3</title>
    <link href="https://yang-makabaka.github.io/posts/b54b17d0.html"/>
    <id>https://yang-makabaka.github.io/posts/b54b17d0.html</id>
    <published>2022-11-23T15:30:08.000Z</published>
    <updated>2022-11-23T15:32:20.621Z</updated>
    
    <content type="html"><![CDATA[<h1 id="第六章：PyTorch进阶训练技巧"><a href="#第六章：PyTorch进阶训练技巧" class="headerlink" title="Chapter 6: Advanced PyTorch Training Techniques"></a>Chapter 6: Advanced PyTorch Training Techniques</h1><p>DataWhale online documentation: <a href="https://datawhalechina.github.io/thorough-pytorch/第六章/index.html">https://datawhalechina.github.io/thorough-pytorch/第六章/index.html</a></p><h2 id="6-1-自定义损失函数"><a href="#6-1-自定义损失函数" class="headerlink" title="6.1 Custom Loss Functions"></a>6.1 Custom Loss Functions</h2><p>In research we often propose a new loss function to improve model performance, in which case we have to implement the loss ourselves.</p><h4 id="（1）以函数方式定义"><a href="#（1）以函数方式定义" class="headerlink" title="(1) Defining the loss as a function"></a>(1) Defining the loss as a function</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">def my_loss(output, target):</span><br><span class="line">    loss = torch.mean((output - target)**2)</span><br><span class="line">    return loss</span><br></pre></td></tr></table></figure><h4 id="（2）以类方式定义"><a href="#（2）以类方式定义" class="headerlink" title="(2) Defining the loss as a class"></a>(2) Defining the loss as a class</h4><p>If we look at the inheritance chain of PyTorch's built-in loss functions, we find that some inherit from <code>_Loss</code> and others from <code>_WeightedLoss</code>; <code>_WeightedLoss</code> inherits from <code>_Loss</code>, and <code>_Loss</code> inherits from <strong>nn.Module</strong>. A loss can therefore be treated like a layer of the network, and our own loss class should likewise inherit from <strong>nn.Module</strong>.</p><p>Example: Dice Loss, built on the Dice coefficient \( DSC = \frac{2|X \cap Y|}{|X| + |Y|} \)</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td 
class="code"><pre><span class="line">class DiceLoss(nn.Module):</span><br><span class="line">    def __init__(self,weight=None,size_average=True):</span><br><span class="line">        super(DiceLoss,self).__init__()</span><br><span class="line">        </span><br><span class="line">    def forward(self,inputs,targets,smooth=1):</span><br><span class="line">        inputs = torch.sigmoid(inputs)  # map logits to (0, 1); F.sigmoid is deprecated</span><br><span class="line">        inputs = inputs.view(-1)  # flatten predictions</span><br><span class="line">        targets = targets.view(-1)  # flatten labels</span><br><span class="line">        intersection = (inputs * targets).sum()                   </span><br><span class="line">        dice = (2.*intersection + smooth)/(inputs.sum() + targets.sum() + smooth)  </span><br><span class="line">        return 1 - dice</span><br><span class="line"></span><br><span class="line"># Usage</span><br><span class="line">criterion = DiceLoss()</span><br><span class="line">loss = criterion(inputs, targets)</span><br></pre></td></tr></table></figure><h2 id="6-2-动态调整学习率"><a href="#6-2-动态调整学习率" class="headerlink" title="6.2 Dynamically Adjusting the Learning Rate"></a>6.2 Dynamically Adjusting the Learning Rate</h2><p>A fixed learning rate often stops suiting the model as training progresses. A suitable learning rate decay policy addresses this and can improve accuracy. Such a policy is called a scheduler.</p><h4 id="（1）使用官方scheduler"><a href="#（1）使用官方scheduler" class="headerlink" title="(1) Using the official schedulers"></a>(1) Using the official schedulers</h4><p>PyTorch packages a number of learning rate adjustment methods in torch.optim.lr_scheduler.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"># 
Choose an optimizer</span><br><span class="line">optimizer = torch.optim.Adam(...) </span><br><span class="line"># Choose one or more schedulers from torch.optim.lr_scheduler</span><br><span class="line">scheduler1 = torch.optim.lr_scheduler.... </span><br><span class="line">scheduler2 = torch.optim.lr_scheduler....</span><br><span class="line">...</span><br><span class="line">schedulern = torch.optim.lr_scheduler....</span><br><span class="line"># Train</span><br><span class="line">for epoch in range(100):</span><br><span class="line">    train(...)</span><br><span class="line">    validate(...)</span><br><span class="line">    optimizer.step()</span><br><span class="line">    # Adjust the learning rate only after the optimizer has updated the parameters</span><br><span class="line">    scheduler1.step() </span><br><span class="line">    ...</span><br><span class="line">    schedulern.step()  # call after optimizer.step()</span><br></pre></td></tr></table></figure><h4 id="（2）自定义scheduler"><a href="#（2）自定义scheduler" class="headerlink" title="(2) Custom scheduler"></a>(2) Custom scheduler</h4><p>Define a function <code>adjust_learning_rate</code> that changes the value of <code>lr</code> in each <code>param_group</code>.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">def adjust_learning_rate(optimizer, epoch): # adapt the rule to your needs</span><br><span class="line">    lr = ....</span><br><span class="line">    for param_group in optimizer.param_groups:</span><br><span class="line">        param_group[&#x27;lr&#x27;] = lr</span><br><span class="line">        </span><br><span class="line">def adjust_learning_rate(optimizer,...):</span><br><span class="line">    ...</span><br><span class="line">optimizer = 
torch.optim.SGD(model.parameters(),lr = args.lr,momentum = 0.9)</span><br><span class="line">for epoch in range(10):</span><br><span class="line">    train(...)</span><br><span class="line">    validate(...)</span><br><span class="line">    adjust_learning_rate(optimizer,epoch)</span><br></pre></td></tr></table></figure><h2 id="6-3-模型微调-torchvision"><a href="#6-3-模型微调-torchvision" class="headerlink" title="6.3 Model Fine-tuning: torchvision"></a>6.3 Model Fine-tuning: torchvision</h2><p>When a dataset is too small or too expensive to collect, transfer learning can help.</p><p>A major application of transfer learning is fine-tuning a pretrained model.</p><h4 id="6-3-1-模型微调的流程"><a href="#6-3-1-模型微调的流程" class="headerlink" title="6.3.1 The fine-tuning workflow"></a>6.3.1 The fine-tuning workflow</h4><ol><li>Pretrain a model on the source dataset; this is the source model.</li><li>Create a new target model that copies the whole design of the source model and its parameters, except for the output layer.</li><li>Add to the target model an output layer whose output size is the number of classes in the target dataset, and initialize that layer's parameters randomly.</li><li>Train the target model on the target dataset. The output layer is trained from scratch, while the parameters of all other layers are fine-tuned starting from the source model's parameters.</li></ol><p>We assume that these model parameters contain knowledge learned from the source dataset and that this knowledge also applies to the target dataset. We further assume that the source model's output layer is closely tied to the labels of the source dataset, so it is not reused in the target model.</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/finetune.png" alt="finetune"></p><h4 id="（2）使用已有模型结构"><a href="#（2）使用已有模型结构" class="headerlink" title="(2) Using existing model architectures"></a>(2) Using existing model architectures</h4><ul><li><p>Instantiate the network</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">import torchvision.models as models</span><br><span class="line">resnet18 = models.resnet18()</span><br><span class="line"># resnet18 = models.resnet18(pretrained=False)  is equivalent to the line above</span><br><span class="line">alexnet = 
models.alexnet()</span><br><span class="line">vgg16 = models.vgg16()</span><br><span class="line">squeezenet = models.squeezenet1_0()</span><br><span class="line">densenet = models.densenet161()</span><br><span class="line">inception = models.inception_v3()</span><br><span class="line">googlenet = models.googlenet()</span><br><span class="line">shufflenet = models.shufflenet_v2_x1_0()</span><br><span class="line">mobilenet_v2 = models.mobilenet_v2()</span><br><span class="line">mobilenet_v3_large = models.mobilenet_v3_large()</span><br><span class="line">mobilenet_v3_small = models.mobilenet_v3_small()</span><br><span class="line">resnext50_32x4d = models.resnext50_32x4d()</span><br><span class="line">wide_resnet50_2 = models.wide_resnet50_2()</span><br><span class="line">mnasnet = models.mnasnet1_0()</span><br></pre></td></tr></table></figure></li><li><p>Pass the <code>pretrained</code> argument</p></li></ul><p><code>True</code> or <code>False</code> decides whether pretrained weights are used. By default <code>pretrained = False</code>, which means no pretrained weights are loaded; with <code>pretrained = True</code>, the model loads weights pretrained on some dataset. In torchvision 0.13 and later, <code>pretrained</code> is deprecated in favor of the <code>weights</code> argument.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">import torchvision.models as models</span><br><span class="line">resnet18 = models.resnet18(pretrained=True)</span><br></pre></td></tr></table></figure><p><strong>Notes:</strong></p><ol><li><p>PyTorch model files usually have the extension <code>.pt</code> or <code>.pth</code>. At run time the program first checks the default path for weights that were already downloaded, so once the weights are there they are not downloaded again.</p></li><li><p>Downloads are often slow. You can look up the <code>model_urls</code> entry for your model <a href="https://github.com/pytorch/vision/tree/master/torchvision/models">here</a> and fetch the file manually with a download manager such as Xunlei. On <code>Linux</code> and <code>Mac</code>, the default download path for pretrained weights is the <code>.cache</code> folder under the user's home directory. On <code>Windows</code> it is <code>C:\Users\&lt;username&gt;\.cache\torch\hub\checkpoints</code>. The download location can be set through <a 
href="https://pytorch.org/docs/stable/model_zoo.html#torch.utils.model_zoo.load_url"><code>torch.utils.model_zoo.load_url()</code></a>.</p></li><li><p>You can also download the weights yourself, place them in the project folder, and then load the parameters into the network.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">self.model = models.resnet50(pretrained=False)</span><br><span class="line">self.model.load_state_dict(torch.load(&#x27;./model/resnet50-19c8e357.pth&#x27;))</span><br></pre></td></tr></table></figure></li><li><p>If a download is interrupted, be sure to delete the partial weight file from the corresponding path; otherwise loading may fail with an error.</p></li></ol><h4 id="（3）训练特定层"><a href="#（3）训练特定层" class="headerlink" title="(3) Training specific layers"></a>(3) Training specific layers</h4><p>When we use the model as a feature extractor and want gradients only for the newly initialized layers, leaving all other parameters unchanged, we freeze those layers by setting <code>requires_grad = False</code>.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">def set_parameter_requires_grad(model, feature_extracting):</span><br><span class="line">    if feature_extracting:</span><br><span class="line">        for param in model.parameters():</span><br><span class="line">            param.requires_grad = False</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">import torch.nn as nn</span><br><span class="line">import torchvision.models as models</span><br><span class="line"># Freeze the gradients of the pretrained parameters</span><br><span class="line">feature_extract = True</span><br><span class="line">model = models.resnet18(pretrained=True)</span><br><span class="line">set_parameter_requires_grad(model, 
feature_extract)  # helper defined above</span><br><span class="line"># Replace the output layer</span><br><span class="line">num_ftrs = model.fc.in_features</span><br><span class="line">model.fc = nn.Linear(in_features=num_ftrs, out_features=4, bias=True)</span><br></pre></td></tr></table></figure><p>Only the parameters of the last layer change; the feature-extraction parameters stay fixed. Note the order: we first freeze the gradients of all model parameters and then replace the fully connected output layer, so the parameters of the new layer do require gradients. During training, gradients still propagate back through the model, but only the fc layer is updated. Setting the requires_grad attribute of parameters thus lets us train specific layers of a model, which is essential for fine-tuning.</p><h2 id="6-4-模型微调-timm"><a href="#6-4-模型微调-timm" class="headerlink" title="6.4 Model Fine-tuning: timm"></a>6.4 Model Fine-tuning: timm</h2><p>timm is another library of pretrained models. It provides many state-of-the-art (SOTA) computer vision models and can be seen as an extended version of torchvision; its models also tend to reach high accuracy.</p><p>Original text: <a href="https://datawhalechina.github.io/thorough-pytorch/第六章/6.3%20模型微调-timm.html">https://datawhalechina.github.io/thorough-pytorch/第六章/6.3%20模型微调-timm.html</a></p><h2 id="6-5半精度训练"><a href="#6-5半精度训练" class="headerlink" title="6.5 Half-Precision Training"></a>6.5 Half-Precision Training</h2><p>PyTorch stores floating-point numbers as torch.float32 by default. Most scenarios do not need that much precision, so half-precision training (torch.float16) can be used to reduce GPU memory usage.</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/float16.jpg" alt="amp"></p><p>How to set it up:</p><ul><li><strong>import autocast</strong></li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">from torch.cuda.amp import autocast</span><br></pre></td></tr></table></figure><ul><li><strong>Model setup</strong></li></ul><p>In the model definition, decorate the model's forward function with autocast, using Python's decorator syntax. For more on decorators, see <a href="https://www.cnblogs.com/jfdwd/p/11253925.html">here</a>:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">@autocast()   </span><br><span class="line">def forward(self, x):</span><br><span class="line">    ...</span><br><span class="line">    return 
x</span><br></pre></td></tr></table></figure><ul><li><strong>Training loop</strong></li></ul><p>During training, place the step that feeds data into the model, together with the computation that follows it (such as the loss), inside <code>with autocast():</code></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">for x in train_loader:</span><br><span class="line">    x = x.cuda()</span><br><span class="line">    with autocast():</span><br><span class="line">        output = model(x)</span><br><span class="line">        ...</span><br></pre></td></tr></table></figure><h2 id="6-6-数据增强-imgaug"><a href="#6-6-数据增强-imgaug" class="headerlink" title="6.6 Data Augmentation: imgaug"></a>6.6 Data Augmentation: imgaug</h2><p>Deep learning needs large amounts of data. When there is not enough, data augmentation can increase the size and quality of the training set.</p><h4 id="（1）imgaug"><a href="#（1）imgaug" class="headerlink" title="(1) imgaug"></a>(1) imgaug</h4><p><code>imgaug</code> is a data augmentation package commonly used in computer vision tasks. Compared with <code>torchvision.transforms</code>, it offers more augmentation methods.</p><ol><li>GitHub: <a href="https://github.com/aleju/imgaug">imgaug</a></li><li>Read the Docs: <a href="https://imgaug.readthedocs.io/en/latest/source/examples_basics.html">imgaug</a></li><li>Official example notebooks: <a href="https://github.com/aleju/imgaug-doc/tree/master/notebooks">notebook</a></li></ol><p><strong>Installation:</strong></p><h4 id="conda"><a href="#conda" class="headerlink" title="conda"></a>conda</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">conda config --add channels conda-forge</span><br><span class="line">conda install imgaug</span><br></pre></td></tr></table></figure><h4 id="pip"><a href="#pip" class="headerlink" title="pip"></a>pip</h4><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">#  install imgaug either via pypi</span><br><span class="line"></span><br><span class="line">pip install imgaug</span><br><span class="line"></span><br><span class="line">#  install the latest version directly from github</span><br><span class="line"></span><br><span class="line">pip install git+https://github.com/aleju/imgaug.git</span><br></pre></td></tr></table></figure><p>Details: <a href="https://datawhalechina.github.io/thorough-pytorch/第六章/6.5%20数据增强-imgaug.html">https://datawhalechina.github.io/thorough-pytorch/第六章/6.5%20数据增强-imgaug.html</a></p><h2 id="6-7-使用argparse进行调参"><a href="#6-7-使用argparse进行调参" class="headerlink" title="6.7 Tuning Hyperparameters with argparse"></a>6.7 Tuning Hyperparameters with argparse</h2><p>argparse parses the command-line arguments we supply so that they can be passed on as the model's hyperparameters.</p><p>For example, running <code>python file.py --lr 1e-4 --batch_size 32</code> sets common hyperparameters from the command line.</p><h4 id="（1）使用"><a href="#（1）使用" class="headerlink" title="(1) Usage"></a>(1) Usage</h4><ul><li>Create an <code>ArgumentParser()</code> object</li><li>Call the <code>add_argument()</code> method to add arguments</li><li>Parse the arguments with <code>parse_args()</code></li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"># demo.py</span><br><span 
class="line">import argparse</span><br><span class="line"></span><br><span class="line"># Create the ArgumentParser() object</span><br><span class="line">parser = argparse.ArgumentParser()</span><br><span class="line"></span><br><span class="line"># Add arguments</span><br><span class="line">parser.add_argument(&#x27;-o&#x27;, &#x27;--output&#x27;, action=&#x27;store_true&#x27;, </span><br><span class="line">    help=&quot;shows output&quot;)</span><br><span class="line"># action = `store_true` sets output to True when the flag is given</span><br><span class="line"># type specifies the type the argument is converted to</span><br><span class="line"># default specifies the default value</span><br><span class="line">parser.add_argument(&#x27;--lr&#x27;, type=float, default=3e-5, help=&#x27;select the learning rate, default=3e-5&#x27;) </span><br><span class="line"></span><br><span class="line">parser.add_argument(&#x27;--batch_size&#x27;, type=int, required=True, help=&#x27;input batch size&#x27;)  </span><br><span class="line"># Parse the arguments with parse_args()</span><br><span class="line">args = parser.parse_args()</span><br><span class="line"></span><br><span class="line">if args.output:</span><br><span class="line">    print(&quot;This is some output&quot;)</span><br><span class="line">    print(f&quot;learning rate: &#123;args.lr&#125;&quot;)</span><br></pre></td></tr></table></figure><p>Running <code>python demo.py -o --lr 3e-4 --batch_size 32</code> on the command line produces the output below. The <code>-o</code> flag is required here, because the messages are printed only when <code>args.output</code> is True.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">This is some output</span><br><span class="line">learning rate: 0.0003</span><br></pre></td></tr></table></figure><h4 id="（2）原文作者的方法"><a href="#（2）原文作者的方法" class="headerlink" title="(2) The original author's approach"></a>(2) The original author's approach</h4><p>Everyone manages hyperparameters differently. Here I share how I manage them with argparse, in the hope that it is a useful reference. To keep the code concise and modular, I usually put everything related to hyperparameters in <code>config.py</code> and import it from <code>train.py</code> or other files. A <code>config.py</code> can look like the following.</p><figure class="highlight 
plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line">import argparse  </span><br><span class="line">  </span><br><span class="line">def get_options(parser=argparse.ArgumentParser()):  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--workers&#x27;, type=int, default=0,  </span><br><span class="line">                        help=&#x27;number of data loading workers, you had better put it &#x27;  </span><br><span class="line">                              &#x27;4 times of your 
gpu&#x27;)  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--batch_size&#x27;, type=int, default=4, help=&#x27;input batch size, default=4&#x27;)  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--niter&#x27;, type=int, default=10, help=&#x27;number of epochs to train for, default=10&#x27;)  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--lr&#x27;, type=float, default=3e-5, help=&#x27;select the learning rate, default=3e-5&#x27;)  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--seed&#x27;, type=int, default=118, help=&quot;random seed&quot;)  </span><br><span class="line">  </span><br><span class="line">    parser.add_argument(&#x27;--cuda&#x27;, action=&#x27;store_true&#x27;, default=True, help=&#x27;enables cuda&#x27;)  </span><br><span class="line">    parser.add_argument(&#x27;--checkpoint_path&#x27;,type=str,default=&#x27;&#x27;,  </span><br><span class="line">                        help=&#x27;Path to load a previous trained model if not empty (default empty)&#x27;)  </span><br><span class="line">    parser.add_argument(&#x27;--output&#x27;,action=&#x27;store_true&#x27;,default=True,help=&quot;shows output&quot;)  </span><br><span class="line">  </span><br><span class="line">    opt = parser.parse_args()  </span><br><span class="line">  </span><br><span class="line">    if opt.output:  </span><br><span class="line">        print(f&#x27;num_workers: &#123;opt.workers&#125;&#x27;)  </span><br><span class="line">        print(f&#x27;batch_size: &#123;opt.batch_size&#125;&#x27;)  </span><br><span class="line">        print(f&#x27;epochs (niters) : &#123;opt.niter&#125;&#x27;)  </span><br><span class="line">        print(f&#x27;learning rate : &#123;opt.lr&#125;&#x27;)  </span><br><span class="line">        print(f&#x27;manual_seed: &#123;opt.seed&#125;&#x27;)  </span><br><span 
class="line">        print(f&#x27;cuda enable: &#123;opt.cuda&#125;&#x27;)  </span><br><span class="line">        print(f&#x27;checkpoint_path: &#123;opt.checkpoint_path&#125;&#x27;)  </span><br><span class="line">  </span><br><span class="line">    return opt  </span><br><span class="line">  </span><br><span class="line">if __name__ == &#x27;__main__&#x27;:  </span><br><span class="line">    opt = get_options()</span><br><span class="line">$ python config.py</span><br><span class="line"></span><br><span class="line">num_workers: 0</span><br><span class="line">batch_size: 4</span><br><span class="line">epochs (niters) : 10</span><br><span class="line">learning rate : 3e-05</span><br><span class="line">manual_seed: 118</span><br><span class="line">cuda enable: True</span><br><span class="line">checkpoint_path:</span><br></pre></td></tr></table></figure><p>Then, in <code>train.py</code> or any other file, the parameters can be read with a structure like the one below.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"># 
Import the required libraries</span><br><span class="line">...</span><br><span class="line">import config</span><br><span class="line"></span><br><span class="line">opt = config.get_options()</span><br><span class="line"></span><br><span class="line">manual_seed = opt.seed</span><br><span class="line">num_workers = opt.workers</span><br><span class="line">batch_size = opt.batch_size</span><br><span class="line">lr = opt.lr</span><br><span class="line">niters = opt.niter  # the argument is registered as --niter</span><br><span class="line">checkpoint_path = opt.checkpoint_path</span><br><span class="line"></span><br><span class="line"># Seed the random number generators so results can be reproduced</span><br><span class="line">def set_seed(seed):</span><br><span class="line">    torch.manual_seed(seed)</span><br><span class="line">    torch.cuda.manual_seed_all(seed)</span><br><span class="line">    random.seed(seed)</span><br><span class="line">    np.random.seed(seed)</span><br><span class="line">    torch.backends.cudnn.benchmark = False</span><br><span class="line">    torch.backends.cudnn.deterministic = True</span><br><span class="line"></span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">if __name__ == &#x27;__main__&#x27;:</span><br><span class="line">    set_seed(manual_seed)</span><br><span class="line">    for epoch in range(niters):</span><br><span class="line">        train(model,lr,batch_size,num_workers,checkpoint_path)</span><br><span class="line">        val(model,lr,batch_size,num_workers,checkpoint_path)</span><br></pre></td></tr></table></figure>]]></content>
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;第六章：PyTorch进阶训练技巧&quot;&gt;&lt;a href=&quot;#第六章：PyTorch进阶训练技巧&quot; class=&quot;headerlink&quot; title=&quot;第六章：PyTorch进阶训练技巧&quot;&gt;&lt;/a&gt;第六章：PyTorch进阶训练技巧&lt;/h1&gt;&lt;p&gt;DataWhale在</summary>
      
    
    
    
    
    <category term="笔记" scheme="https://yang-makabaka.github.io/tags/%E7%AC%94%E8%AE%B0/"/>
    
  </entry>
  
  <entry>
    <title>ThoroughPytorch——2</title>
    <link href="https://yang-makabaka.github.io/posts/8987048d.html"/>
    <id>https://yang-makabaka.github.io/posts/8987048d.html</id>
    <published>2022-11-20T16:04:53.000Z</published>
    <updated>2022-11-23T15:30:30.518Z</updated>
    
    <content type="html"><![CDATA[<h1 id="PyTorch模型定义"><a href="#PyTorch模型定义" class="headerlink" title="PyTorch Model Definition"></a>PyTorch Model Definition</h1><p>DataWhale: <a href="https://datawhalechina.github.io/thorough-pytorch/">https://datawhalechina.github.io/thorough-pytorch/</a>  </p><h2 id="一、模型定义"><a href="#一、模型定义" class="headerlink" title="I. Model Definition"></a>I. Model Definition</h2><ul><li>The <code>Module</code> class is a model-construction class provided by the <code>torch.nn</code> module (<code>nn.Module</code>). It is the base class of all neural network modules, and we can subclass it to define the model we want;</li><li>A PyTorch model definition should include two main parts: initialization of the components (<code>__init__</code>) and the definition of the data flow (<code>forward</code>).</li></ul><p>Building on <code>nn.Module</code>, we can define PyTorch models in three ways: <code>Sequential</code>, <code>ModuleList</code>, and <code>ModuleDict</code>.</p><h3 id="1-Sequential"><a href="#1-Sequential" class="headerlink" title="1. Sequential"></a>1. Sequential</h3><p>   A simpler way to define models whose forward pass just chains the layers one after another.</p><p>   It accepts submodules, or an ordered dictionary of them, as arguments, adds them one by one, and runs them in that order in the forward pass.</p><p>   It is less flexible and poorly suited to models that take extra external inputs.</p>   <figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">import torch.nn as nn</span><br><span class="line">net = nn.Sequential(</span><br><span class="line">        nn.Linear(784, 256),</span><br><span class="line">        nn.ReLU(),</span><br><span class="line">        nn.Linear(256, 10), </span><br><span class="line">        )    # layers listed directly</span><br></pre></td></tr></table></figure>   <figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">import collections</span><br><span class="line">import torch.nn as nn</span><br><span class="line">net2 = 
nn.Sequential(collections.OrderedDict([</span><br><span class="line">          (&#x27;fc1&#x27;, nn.Linear(784, 256)),</span><br><span class="line">          (&#x27;relu1&#x27;, nn.ReLU()),</span><br><span class="line">          (&#x27;fc2&#x27;, nn.Linear(256, 10))</span><br><span class="line">          ]))    # using an OrderedDict</span><br></pre></td></tr></table></figure><h3 id="2-ModuleList"><a href="#2-ModuleList" class="headerlink" title="2. ModuleList"></a>2. ModuleList</h3><p>Takes a list of submodules (or layers, which must be instances of <code>nn.Module</code>) as input.</p><p>Supports append and extend operations.</p><p>The order in which the layers run must be specified in the forward function.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])</span><br><span class="line">net.append(nn.Linear(256, 10))  # append, as with a Python list</span><br><span class="line">print(net[-1])  # index access, as with a Python list</span><br></pre></td></tr></table></figure><h3 id="3-ModuleDict"><a href="#3-ModuleDict" class="headerlink" title="3. ModuleDict"></a>3. ModuleDict</h3><p>Similar to <code>ModuleList</code>, except that <code>ModuleDict</code> makes it easier to give the layers of a network names.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">net = nn.ModuleDict(&#123;</span><br><span class="line">    &#x27;linear&#x27;: nn.Linear(784, 256),</span><br><span class="line">    &#x27;act&#x27;: nn.ReLU(),</span><br><span class="line">&#125;)</span><br><span class="line">net[&#x27;output&#x27;] = nn.Linear(256, 10) # add a layer</span><br><span class="line">print(net[&#x27;linear&#x27;]) # access by key</span><br><span class="line">print(net.output)</span><br></pre></td></tr></table></figure><h2 
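id="二、利用模型块快速搭建复杂网络"><a href="#二、利用模型块快速搭建复杂网络" class="headerlink" title="二、利用模型块快速搭建复杂网络"></a>II. Building Complex Networks Quickly from Model Blocks</h2><p>We take U-Net as the example.</p><h3 id="1-U-Net"><a href="#1-U-Net" class="headerlink" title="1.U-Net"></a>1.U-Net</h3><p>U-Net links its left (encoder) and right (decoder) paths with skip connections. The two notes below concern the related idea of residual connections, as used in ResNet, which address the degradation problem in model training and allow networks to keep growing deeper.</p><p>1) The vanishing gradient problem</p><p>In a very deep network, parameters are usually initialized close to 0. When the parameters of the shallow layers are updated during training, the gradient tends to vanish as it is propagated back through more and more layers, so the shallow layers stop updating.</p><p>2) The network degradation problem</p><p>For example, suppose the optimal network structure has 18 layers. When designing a network we do not know in advance how many layers are optimal, so suppose we design a 34-layer network. The extra 16 layers are then redundant, and we would hope that during training the model learns to turn them into identity mappings, so that the output of each such layer equals its input. In practice the model often struggles to learn the parameters of these 16 identity mappings correctly, so it performs worse than the optimal 18-layer network. This is the degradation that appears as network depth increases. It is not caused by overfitting, but by the redundant layers learning parameters that are not identity mappings.</p><p><img src="https://datawhalechina.github.io/thorough-pytorch/_images/5.2.1unet.png" alt="unet"></p><p>The main building blocks of U-Net are as follows:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">import torch</span><br><span class="line">import torch.nn as nn</span><br><span class="line">import torch.nn.functional as F</span><br></pre></td></tr></table></figure><p>1) The two convolutions inside each sub-block (Double Convolution)</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">class DoubleConv(nn.Module):</span><br><span class="line">    &quot;&quot;&quot;(convolution =&gt; [BN] =&gt; ReLU) * 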
2&quot;&quot;&quot;</span><br><span class="line"></span><br><span class="line">    def __init__(self, in_channels, out_channels, mid_channels=None):</span><br><span class="line">        super().__init__()</span><br><span class="line">        if not mid_channels:</span><br><span class="line">            mid_channels = out_channels</span><br><span class="line">        self.double_conv = nn.Sequential(</span><br><span class="line">            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),</span><br><span class="line">            nn.BatchNorm2d(mid_channels),</span><br><span class="line">            nn.ReLU(inplace=True),</span><br><span class="line">            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),</span><br><span class="line">            nn.BatchNorm2d(out_channels),</span><br><span class="line">            nn.ReLU(inplace=True)</span><br><span class="line">        )</span><br><span class="line"></span><br><span class="line">    def forward(self, x):</span><br><span class="line">        return self.double_conv(x)</span><br></pre></td></tr></table></figure><p>2) The downsampling connection between blocks on the left side, i.e. max pooling</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">class Down(nn.Module):</span><br><span class="line">    &quot;&quot;&quot;Downscaling with maxpool then double conv&quot;&quot;&quot;</span><br><span class="line"></span><br><span class="line">    def __init__(self, in_channels, out_channels):</span><br><span class="line">        super().__init__()</span><br><span 
class="line">        self.maxpool_conv = nn.Sequential(</span><br><span class="line">            nn.MaxPool2d(2),</span><br><span class="line">            DoubleConv(in_channels, out_channels)</span><br><span class="line">        )</span><br><span class="line"></span><br><span class="line">    def forward(self, x):</span><br><span class="line">        return self.maxpool_conv(x)</span><br></pre></td></tr></table></figure><p>3) The upsampling connection between blocks on the right side (Up sampling)</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">class Up(nn.Module):</span><br><span class="line">    &quot;&quot;&quot;Upscaling then double conv&quot;&quot;&quot;</span><br><span class="line"></span><br><span class="line">    def __init__(self, in_channels, out_channels, bilinear=False):</span><br><span class="line">        super().__init__()</span><br><span class="line"></span><br><span class="line">        # if bilinear, use the normal convolutions to reduce the number of channels</span><br><span class="line">        if bilinear:</span><br><span class="line">            self.up = 
nn.Upsample(scale_factor=2, mode=&#x27;bilinear&#x27;, align_corners=True)</span><br><span class="line">            self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)</span><br><span class="line">        else:</span><br><span class="line">            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)</span><br><span class="line">            self.conv = DoubleConv(in_channels, out_channels)</span><br><span class="line"></span><br><span class="line">    def forward(self, x1, x2):</span><br><span class="line">        x1 = self.up(x1)</span><br><span class="line">        # input is NCHW: dim 2 is height, dim 3 is width</span><br><span class="line">        diffY = x2.size()[2] - x1.size()[2]</span><br><span class="line">        diffX = x2.size()[3] - x1.size()[3]</span><br><span class="line"></span><br><span class="line">        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,</span><br><span class="line">                        diffY // 2, diffY - diffY // 2])</span><br><span class="line">        # if you have padding issues, see</span><br><span class="line">        # https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a</span><br><span class="line">        # https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd</span><br><span class="line">        x = torch.cat([x2, x1], dim=1)</span><br><span class="line">        return self.conv(x)</span><br></pre></td></tr></table></figure><p>4) The output layer</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">class OutConv(nn.Module):</span><br><span class="line">    def __init__(self, in_channels, out_channels):</span><br><span 
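class="line">        super(OutConv, self).__init__()</span><br><span class="line">        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)</span><br><span class="line"></span><br><span class="line">    def forward(self, x):</span><br><span class="line">        return self.conv(x)</span><br></pre></td></tr></table></figure><h2 id="三、修改模型"><a href="#三、修改模型" class="headerlink" title="三、修改模型"></a>III. Modifying a Model</h2><p>Sometimes we need to modify the structure of an existing model.</p><h3 id="1-修改模型层"><a href="#1-修改模型层" class="headerlink" title="1.修改模型层"></a>1.Modifying model layers</h3><p>You can change the number of output nodes, the number of layers, and so on.</p><h3 id="2-添加外部输入"><a href="#2-添加外部输入" class="headerlink" title="2.添加外部输入"></a>2.Adding an external input</h3><p>Treat the part of the original model that comes before the point where the new input is added as a single unit. Then define in forward how that unchanged part, the added input, and the subsequent layers connect to one another. This completes the modification.</p><h3 id="3-添加额外输出"><a href="#3-添加额外输出" class="headerlink" title="3.添加额外输出"></a>3.Adding an extra output</h3><p>Return the output of one of the model's intermediate layers so that extra supervision can be applied to it and better intermediate results obtained. The basic idea is to change what the forward function in the model definition returns.</p><h2 id="四、PyTorch模型保存与读取"><a href="#四、PyTorch模型保存与读取" class="headerlink" title="四、PyTorch模型保存与读取"></a>IV. Saving and Loading PyTorch Models</h2><p>A PyTorch model consists of two main parts: the model structure and the weights.</p><p>The model is a class that inherits from nn.Module. The weights are stored in a dictionary, returned by the model's state_dict() method, whose keys are layer (parameter) names and whose values are weight tensors.</p><p>There are two ways to save: store the whole model (structure and weights), or store only the model weights.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"># Save the whole model (structure and weights)</span><br><span class="line">torch.save(model, save_dir)</span><br><span class="line"># Save only the weights; state_dict is a method and must be called</span><br><span class="line">torch.save(model.state_dict(), save_dir)</span><br></pre></td></tr></table></figure><p>For single-GPU versus multi-GPU saving and loading, see the DataWhale online documentation: <a href="https://datawhalechina.github.io/thorough-pytorch/第五章/5.4%20PyTorh模型保存与读取.html">https://datawhalechina.github.io/thorough-pytorch/第五章/5.4%20PyTorh模型保存与读取.html</a></p>]]></content>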
    
    
      
      
    <summary type="html">&lt;h1 id=&quot;PyTorch模型定义&quot;&gt;&lt;a href=&quot;#PyTorch模型定义&quot; class=&quot;headerlink&quot; title=&quot;PyTorch模型定义&quot;&gt;&lt;/a&gt;PyTorch模型定义&lt;/h1&gt;&lt;p&gt;DataWhale:&lt;a href=&quot;https://datawha</summary>
      
    
    
    
    
    <category term="笔记" scheme="https://yang-makabaka.github.io/tags/%E7%AC%94%E8%AE%B0/"/>
    
  </entry>
  
</feed>
