<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://gaoxingliang.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://gaoxingliang.github.io/" rel="alternate" type="text/html" /><updated>2026-05-22T02:28:31+00:00</updated><id>https://gaoxingliang.github.io/feed.xml</id><title type="html">Tech Blog</title><subtitle>Sharing technical practice and development notes</subtitle><entry><title type="html">Quarkus usage and migration notes</title><link href="https://gaoxingliang.github.io/blog/2026/04/16/quarkus/" rel="alternate" type="text/html" title="Quarkus usage and migration notes" /><published>2026-04-16T02:00:00+00:00</published><updated>2026-04-16T02:00:00+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/04/16/quarkus</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/04/16/quarkus/"><![CDATA[<h2 id="quarkus">quarkus</h2>
<p>Notes on the issues I ran into while using Quarkus and migrating to it.</p>

<h3 id="build加速">Speeding up the build</h3>

<div class="language-properties highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">quarkus.native.builder-image</span><span class="p">=</span><span class="s">quay.io/quarkus/ubi9-quarkus-mandrel-builder-image:jdk-21</span>
<span class="py">quarkus.native.container-build</span><span class="p">=</span><span class="s">true</span>
<span class="py">quarkus.native.builder-image.pull</span><span class="p">=</span><span class="s">missing</span>
</code></pre></div></div>

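<p>Set the builder image in application.properties and set builder-image.pull to missing, so that the image is not pulled on every build and native compilation starts faster.</p>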

<h2 id="docker-build-ubi8-vs-ubi9">docker build ubi8 vs ubi9</h2>
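<p>Starting with Quarkus 3.19, UBI 9 is the default base for native images. It places requirements on the CPU exposed inside the VM (x86-64-v2 support), so you may see the following error (see the <a href="https://github.com/quarkusio/quarkus/wiki/Migration-Guide-3.19">Quarkus 3.19 migration guide</a>):</p>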
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Fatal glibc error: CPU does not support x86-64-v2
</code></pre></div></div>
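<p>To avoid this, use UBI 8. Keep only one of the two <code class="language-plaintext highlighter-rouge">quarkus.native.builder-image</code> lines below:</p>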
<div class="language-properties highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c"># ubi9
</span><span class="py">quarkus.native.builder-image</span><span class="p">=</span><span class="s">quay.io/quarkus/ubi9-quarkus-mandrel-builder-image:jdk-21</span>

<span class="c"># ubi8
</span><span class="py">quarkus.native.builder-image</span><span class="p">=</span><span class="s">quay.io/quarkus/ubi-quarkus-mandrel-builder-image:jdk-21</span>
</code></pre></div></div>
<p>Change the base image in the <code class="language-plaintext highlighter-rouge">Dockerfile</code> to match:</p>
<div class="language-Dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ubi9</span>
<span class="k">FROM</span><span class="s"> registry.access.redhat.com/ubi9/ubi-minimal:9.6</span>

<span class="c"># ubi8</span>
<span class="k">FROM</span><span class="s"> registry.access.redhat.com/ubi8-minimal:8.10</span>
</code></pre></div></div>
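<p>If you are unsure whether a host is affected, the CPU flags can be checked before choosing between UBI 8 and UBI 9. The following is a minimal Java sketch and not part of Quarkus: the class name CpuLevelSketch and its helper are made up for illustration, and the flag list assumes the x86-64-v2 feature set (CMPXCHG16B, LAHF/SAHF, POPCNT, SSE3, SSE4.1, SSE4.2, SSSE3) as Linux names it in /proc/cpuinfo.</p>

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class CpuLevelSketch {
    // Flag names as they appear in /proc/cpuinfo for the x86-64-v2 feature set (assumption, see lead-in).
    static final String[] V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"};

    // Returns true when every x86-64-v2 flag is present in the given "flags" value.
    static boolean supportsV2(String flagsLine) {
        var present = Arrays.asList(flagsLine.trim().split("\\s+"));
        return present.containsAll(Arrays.asList(V2_FLAGS));
    }

    public static void main(String[] args) throws Exception {
        Path cpuinfo = Path.of("/proc/cpuinfo");
        if (!Files.exists(cpuinfo)) {
            System.out.println("no /proc/cpuinfo on this system");
            return;
        }
        for (String line : Files.readAllLines(cpuinfo)) {
            if (line.startsWith("flags")) {
                // The value follows the first colon, for example "flags : fpu vme ..."
                String flags = line.substring(line.indexOf(':') + 1);
                System.out.println("x86-64-v2 supported: " + supportsV2(flags));
                return;
            }
        }
        System.out.println("no flags line found");
    }
}
```

<p>Run it on the machine (or inside the VM) that will run the native container. If it reports false, the UBI 8 images are the safer choice.</p>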

<h2 id="docker-自定义镜像">Custom Docker image</h2>

<p>You can install extra tools on top of the native base image to make troubleshooting easier:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> registry.access.redhat.com/ubi9/ubi-minimal:9.3</span>

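<span class="c"># Install the tools non-interactively</span>
<span class="c"># --releasever=9: works around environments where the release version cannot be detected</span>
<span class="c"># --nodocs: skip documentation, which noticeably reduces the image size</span>
<span class="c"># clean all: remove the cached metadata</span>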
<span class="k">RUN </span>microdnf update <span class="nt">-y</span> <span class="nt">--releasever</span><span class="o">=</span>9 <span class="o">&amp;&amp;</span> <span class="se">\
</span>    microdnf <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--releasever</span><span class="o">=</span>9 <span class="nt">--nodocs</span> <span class="se">\
</span>        procps-ng <span class="se">\
</span>        net-tools <span class="se">\
</span>        wget <span class="se">\
</span>        vim-minimal <span class="o">&amp;&amp;</span> <span class="se">\
</span>    microdnf clean all <span class="nt">-y</span> <span class="nt">--releasever</span><span class="o">=</span>9 <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">rm</span> <span class="nt">-rf</span> /var/cache/yum
</code></pre></div></div>

<h3 id="grpc">grpc</h3>

<p>Common configuration:</p>
<div class="language-properties highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">quarkus.http.port</span><span class="p">=</span><span class="s">8077</span>
<span class="py">quarkus.grpc.server.port</span><span class="p">=</span><span class="s">9097</span>
<span class="py">quarkus.grpc.server.host</span><span class="p">=</span><span class="s">0.0.0.0</span>
<span class="c"># Reflection (grpc.reflection.v1 / v1alpha) for grpcurl, Postman, etc. Dev mode enables it automatically;
# in prod it is off unless you set GRPC_SERVER_ENABLE_REFLECTION=true (reflection exposes service/schema info).
</span><span class="py">quarkus.grpc.server.enable-reflection-service</span><span class="p">=</span><span class="s">true</span>
<span class="py">quarkus.grpc.server.use-separate-server</span><span class="p">=</span><span class="s">true</span>

<span class="c"># Index external JARs so Jandex sees the gRPC ImplBase / BindableService classes; without this,
# prod/native finds zero bindable services and Quarkus skips starting the gRPC server
# (dev mode still wires server support, which hides the issue locally).
</span><span class="py">quarkus.index-dependency.business-protocol.group-id</span><span class="p">=</span><span class="s">cn.sichuancredit.datasource.business</span>
<span class="py">quarkus.index-dependency.business-protocol.artifact-id</span><span class="p">=</span><span class="s">business-protocol</span>

<span class="c"># Logging gRPC client (@GrpcClient("logging")) — set LOGGING_GRPC_HOST, LOGGING_GRPC_PORT
</span><span class="py">quarkus.grpc.clients.logging.host</span><span class="p">=</span><span class="s">${LOGGING_GRPC_HOST:192.168.102.224}</span>
<span class="py">quarkus.grpc.clients.logging.port</span><span class="p">=</span><span class="s">${LOGGING_GRPC_PORT:6391}</span>
<span class="py">quarkus.grpc.clients.logging.plain-text</span><span class="p">=</span><span class="s">true</span>

</code></pre></div></div>

<p>Notes:<br />
(1) If you implement a gRPC service, <code class="language-plaintext highlighter-rouge">quarkus.index-dependency</code> must be set; its values are the Maven group and artifact that contain your service. <strong>This only matters in native mode</strong>: without it the service does not start properly.<br />
(2) Setting <code class="language-plaintext highlighter-rouge">quarkus.grpc.server.enable-reflection-service=true</code> lets your gRPC clients fetch the service definitions automatically through reflection.</p>

<h4 id="移除grpc依赖避免log4j-引入">Excluding the gRPC dependency that pulls in log4j</h4>

<p>Exclude the dependency that references log4j so that the native build does not fail:</p>

<p>Problem:</p>

<p>Caused by: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Discovered unresolved type during parsing: io.grpc.netty.shaded.io.netty.util.internal.logging.Log4J2Logger. This error is reported at image build time because class io.grpc.netty.shaded.io.netty.util.internal.logging.Log4J2LoggerFactory is registered for linking at image build time by command line and command line. Error encountered while parsing io.grpc.netty.shaded.io.netty.util.internal.logging.InternalLoggerFactory.newDefaultFactory(InternalLoggerFactory.java:42)</p>

<p>Solution:</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">java</span> <span class="o">{</span>
<span class="n">xxxxxx</span>
<span class="o">}</span>

<span class="n">configurations</span><span class="o">.</span><span class="na">all</span> <span class="o">{</span>
    <span class="n">exclude</span> <span class="nl">group:</span> <span class="s1">'io.grpc'</span><span class="o">,</span> <span class="nl">module:</span> <span class="s1">'grpc-netty-shaded'</span>
<span class="o">}</span>

<span class="n">repositories</span> <span class="o">{</span>
<span class="n">yyyyyy</span>
<span class="o">}</span>
</code></pre></div></div>

<h3 id="db">db</h3>

<h3 id="redis">redis</h3>
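<h4 id="自定义key">Custom cache key</h4>
<p>Implement a <code class="language-plaintext highlighter-rouge">CacheKeyGenerator</code> like this:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Builds the cache key from the method name and all parameters except the first one.</span>
<span class="kn">import</span> <span class="nn">io.quarkus.cache.CacheKeyGenerator</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">io.quarkus.runtime.annotations.RegisterForReflection</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">java.lang.reflect.Method</span><span class="o">;</span>

<span class="nd">@RegisterForReflection</span>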
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">CacheKeyGeneratorSkipFirstParam</span> <span class="kd">implements</span> <span class="nc">CacheKeyGenerator</span> <span class="o">{</span>

    <span class="kd">public</span> <span class="nf">CacheKeyGeneratorSkipFirstParam</span><span class="o">()</span> <span class="o">{</span>

    <span class="o">}</span>

    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="nc">Object</span> <span class="nf">generate</span><span class="o">(</span><span class="nc">Method</span> <span class="n">method</span><span class="o">,</span> <span class="nc">Object</span><span class="o">...</span> <span class="n">methodParams</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">StringBuilder</span> <span class="n">sb</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">StringBuilder</span><span class="o">();</span>
        <span class="n">sb</span><span class="o">.</span><span class="na">append</span><span class="o">(</span><span class="n">method</span><span class="o">.</span><span class="na">getName</span><span class="o">()).</span><span class="na">append</span><span class="o">(</span><span class="sc">'-'</span><span class="o">);</span>
        <span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">1</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">methodParams</span><span class="o">.</span><span class="na">length</span><span class="o">;</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
            <span class="n">sb</span><span class="o">.</span><span class="na">append</span><span class="o">(</span><span class="n">methodParams</span><span class="o">[</span><span class="n">i</span><span class="o">]).</span><span class="na">append</span><span class="o">(</span><span class="sc">'-'</span><span class="o">);</span>
        <span class="o">}</span>
        <span class="k">return</span> <span class="n">sb</span><span class="o">.</span><span class="na">toString</span><span class="o">();</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p>Note that the class must have a no-argument constructor and the <code class="language-plaintext highlighter-rouge">@RegisterForReflection</code> annotation.
You can then use it like this:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="cm">/**
     * Cache key matches legacy Spring {@code @Cacheable} (method + idCard + personName semantics via two keys only).
     */</span>
    <span class="nd">@CacheResult</span><span class="o">(</span><span class="n">cacheName</span> <span class="o">=</span> <span class="s">"zzdtec"</span><span class="o">,</span> <span class="n">keyGenerator</span> <span class="o">=</span> <span class="nc">CacheKeyGeneratorSkipFirstParam</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
    <span class="kd">public</span> <span class="nc">FetchResult</span> <span class="nf">load</span><span class="o">(</span><span class="nc">AccessLogContext</span> <span class="n">accessLogContext</span><span class="o">,</span> <span class="nc">String</span> <span class="n">idCard</span><span class="o">,</span> <span class="nc">String</span> <span class="n">personName</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">return</span> <span class="n">httpExecutor</span><span class="o">.</span><span class="na">fetch</span><span class="o">(</span><span class="n">accessLogContext</span><span class="o">,</span> <span class="n">idCard</span><span class="o">,</span> <span class="n">personName</span><span class="o">);</span>
    <span class="o">}</span>
</code></pre></div></div>
<p>The resulting cache key is: cache:zzdtec:load-341224xxxxx-涛yyyy-</p>
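<p>To see how such a key is composed without starting Quarkus, the same concatenation can be reproduced in plain Java. This is only a sketch: CacheKeySketch and buildKey are hypothetical names, the argument values are placeholders, and the cache:zzdtec: prefix seen in Redis is assumed to be added by the cache backend and not by the generator.</p>

```java
public class CacheKeySketch {
    // Mirrors the generator above: method name, then every parameter except the first, each followed by '-'.
    static String buildKey(String methodName, Object... methodParams) {
        StringBuilder sb = new StringBuilder();
        sb.append(methodName).append('-');
        boolean first = true;
        for (Object param : methodParams) {
            if (first) {
                // The first parameter (the access-log context in the post) is not part of the key.
                first = false;
                continue;
            }
            sb.append(param).append('-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values standing in for accessLogContext, idCard and personName.
        System.out.println(buildKey("load", "ctx", "341224xxxxx", "name"));
    }
}
```

<p>Because the first parameter is skipped, two calls that differ only in their context object share one cache entry.</p>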

<h4 id="配置">Configuration</h4>
<div class="language-properties highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Connection and password
</span><span class="py">quarkus.redis.hosts</span><span class="p">=</span><span class="s">${REDIS_HOSTS:redis://192.168.102.221:36379/13}</span>
<span class="py">quarkus.redis.password</span><span class="p">=</span><span class="s">${REDIS_PASSWORD:xxxxx}</span>

<span class="py">quarkus.cache.type</span><span class="p">=</span><span class="s">redis</span>
<span class="c"># TTL of the cache entries
</span><span class="py">quarkus.cache.redis.zzdtec.expire-after-write</span><span class="p">=</span><span class="s">${REDIS_CACHE_EXPIRE:30d}</span>
<span class="c"># The type of the cached value must also be configured, otherwise caching fails. That class needs the @RegisterForReflection annotation as well.
</span><span class="py">quarkus.cache.redis.zzdtec.value-type</span><span class="p">=</span><span class="s">cn.sichuancredit.zzdtec.server.api.FetchResult</span>

</code></pre></div></div>
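<p>The TTL value 30d uses the simplified duration notation. As far as I understand the Quarkus configuration reference, a bare number means seconds, a number followed by ms means milliseconds, a number followed by d is prefixed with P, and a number followed by h, m or s is prefixed with PT before being parsed as a java.time.Duration. The sketch below imitates that rule in plain Java; CacheTtlSketch and parseSimplified are hypothetical names and this is not the Quarkus converter itself.</p>

```java
import java.time.Duration;

public class CacheTtlSketch {
    // Imitates the simplified duration notation described in the lead-in (assumption, not Quarkus code).
    static Duration parseSimplified(String value) {
        if (value.matches("\\d+")) {
            return Duration.ofSeconds(Long.parseLong(value));
        }
        if (value.matches("\\d+ms")) {
            return Duration.ofMillis(Long.parseLong(value.substring(0, value.length() - 2)));
        }
        if (value.matches("\\d+d")) {
            return Duration.parse("P" + value.toUpperCase());
        }
        if (value.matches("\\d+[hms]")) {
            return Duration.parse("PT" + value.toUpperCase());
        }
        // Anything else is expected to be in ISO-8601 form already, for example P30D.
        return Duration.parse(value);
    }

    public static void main(String[] args) {
        System.out.println(parseSimplified("30d").toDays() + " days");
    }
}
```

<p>With the default from the configuration above, entries written to the zzdtec cache would expire 30 days after they are written.</p>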

<h2 id="参考链接">References</h2>]]></content><author><name>ed</name></author><category term="开发" /><category term="quarkus" /><category term="quarkus native" /><summary type="html"><![CDATA[Lessons learned from using Quarkus and Quarkus native]]></summary></entry><entry><title type="html">Using and configuring the Excel &amp;amp; Word MCP servers in Cursor</title><link href="https://gaoxingliang.github.io/blog/2026/03/26/cursor-mcp-excel/" rel="alternate" type="text/html" title="Using and configuring the Excel &amp;amp; Word MCP servers in Cursor" /><published>2026-03-26T02:00:00+00:00</published><updated>2026-03-26T02:00:00+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/03/26/cursor-mcp-excel</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/03/26/cursor-mcp-excel/"><![CDATA[<h2 id="excel-mcp">excel mcp</h2>
<h3 id="安装excel-mcp">Install excel-mcp</h3>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install </span>excel-mcp
</code></pre></div></div>

<h3 id="配置">Configuration</h3>
<p>Add the server to .cursor/mcp.json:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
    </span><span class="nl">"mcpServers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
  
      </span><span class="nl">"excel-mcp"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"command"</span><span class="p">:</span><span class="w"> </span><span class="s2">"python"</span><span class="p">,</span><span class="w"> 
        </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"-m"</span><span class="p">,</span><span class="w"> </span><span class="s2">"excel_mcp"</span><span class="p">,</span><span class="w"> </span><span class="s2">"stdio"</span><span class="p">],</span><span class="w">
        </span><span class="nl">"env"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
          </span><span class="nl">"EXCEL_FILES_PATH"</span><span class="p">:</span><span class="w"> </span><span class="s2">"D:</span><span class="se">\\</span><span class="s2">code</span><span class="se">\\</span><span class="s2">xxx</span><span class="se">\\</span><span class="s2">quanfeng-end</span><span class="se">\\</span><span class="s2">analysis</span><span class="se">\\</span><span class="s2">"</span><span class="w">
        </span><span class="p">},</span><span class="w">
        </span><span class="nl">"transport"</span><span class="p">:</span><span class="w"> </span><span class="s2">"stdio"</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
  
</span></code></pre></div></div>
<p>The environment variable <code class="language-plaintext highlighter-rouge">EXCEL_FILES_PATH</code> sets the directory that contains the Excel files.</p>

<h2 id="word-mcp">word mcp</h2>
<h3 id="安装uv">Install uv</h3>
<p>Download the Windows build (<code class="language-plaintext highlighter-rouge">uv-x86_64-pc-windows-msvc.zip</code>) from the <a href="https://github.com/astral-sh/uv/releases">uv releases page</a>.
After unpacking it, configure the environment variables:<br /></p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Add to PATH: D:<span class="se">\s</span>ofts<span class="se">\u</span>v<span class="se">\</span>
Also set:
<span class="nv">UV_DEFAULT_INDEX</span><span class="o">=</span>https://pypi.tuna.tsinghua.edu.cn/simple
</code></pre></div></div>
<p>Then install the dependencies manually by running the server once:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>D:<span class="se">\s</span>ofts<span class="se">\u</span>v<span class="se">\u</span>vx.exe <span class="nt">--from</span> office-word-mcp-server word_mcp_server
</code></pre></div></div>

<h3 id="cursor-里面配置">Configuration in Cursor</h3>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="w">    </span><span class="nl">"word-document-server"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"command"</span><span class="p">:</span><span class="w"> </span><span class="s2">"D:</span><span class="se">\\</span><span class="s2">softs</span><span class="se">\\</span><span class="s2">uv</span><span class="se">\\</span><span class="s2">uvx.exe"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"--from"</span><span class="p">,</span><span class="w"> </span><span class="s2">"office-word-mcp-server"</span><span class="p">,</span><span class="w"> </span><span class="s2">"word_mcp_server"</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h2 id="参考链接">References</h2>
<ul>
  <li><a href="https://github.com/haris-musa/excel-mcp-server">github excel mcp server</a></li>
  <li><a href="https://github.com/GongRzhe/Office-Word-MCP-Server">word mcp</a></li>
</ul>]]></content><author><name>ed</name></author><category term="开发" /><category term="MCP" /><category term="Cursor" /><summary type="html"><![CDATA[Using the Excel and Word MCP servers in Cursor to make Word and Excel files easier to read and understand.]]></summary></entry><entry><title type="html">Project-level MCP configuration in Cursor and Excel MCP</title><link href="https://gaoxingliang.github.io/blog/2026/03/06/dbhub-docker-cursor-mcp/" rel="alternate" type="text/html" title="Project-level MCP configuration in Cursor and Excel MCP" /><published>2026-03-06T02:00:00+00:00</published><updated>2026-03-06T02:00:00+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/03/06/dbhub-docker-cursor-mcp</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/03/06/dbhub-docker-cursor-mcp/"><![CDATA[<h2 id="什么是-dbhub">What is DBHub?</h2>

<p>DBHub is an MCP server that supports multiple databases, including PostgreSQL, MySQL, SQL Server, MariaDB, and SQLite. Its main features include:</p>

<p>Core MCP tools:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">execute_sql</code>: runs SQL queries, with transaction support and safety controls</li>
  <li><code class="language-plaintext highlighter-rouge">search_objects</code>: searches and browses database schemas, tables, columns, indexes, and stored procedures</li>
</ul>

<hr />

<h2 id="一docker-部署-dbhub">1. Deploying DBHub with Docker</h2>

<h3 id="1-使用-docker-run">1. Using docker run</h3>

<p><strong>Example: connecting to MySQL</strong></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">-d</span> <span class="nt">--restart</span> always <span class="nt">--init</span> <span class="se">\</span>
  <span class="nt">--name</span> dbhub <span class="se">\</span>
  <span class="nt">--publish</span> 7080:7080 <span class="se">\</span>
  bytebase/dbhub <span class="se">\</span>
  <span class="nt">--transport</span> http <span class="se">\</span>
  <span class="nt">--port</span> 7080 <span class="se">\</span>
  <span class="nt">--dsn</span> <span class="s2">"mysql://readonly_admin:your_strong_password@192.168.102.207:3307/yourdb"</span>

</code></pre></div></div>

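<p>Create a read-only MySQL user for DBHub:</p>

<pre><code class="language-mysql">-- 1. Create the user (replace your_strong_password)
CREATE USER 'readonly_admin'@'%' IDENTIFIED BY 'your_strong_password';

-- 2. Grant global SELECT (all databases and all tables become readable)
GRANT SELECT ON *.* TO 'readonly_admin'@'%';

-- 3. Grant the privileges needed to view metadata (important!)
GRANT
    SHOW DATABASES,
    SHOW VIEW,
    PROCESS,           -- see currently running queries (used for performance_schema)
    REPLICATION CLIENT -- see the binlog position (optional)
    ON *.* TO 'readonly_admin'@'%';

-- 4. (Optional) Allow executing stored procedures. The user cannot alter them,
--    but a procedure that runs with definer rights can still modify data.
GRANT EXECUTE ON *.* TO 'readonly_admin'@'%';

-- 5. Reload the privileges
FLUSH PRIVILEGES;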
</code></pre>

<p>Once deployed, DBHub is available at <code class="language-plaintext highlighter-rouge">http://localhost:7080</code> and provides:</p>

<ul>
  <li><strong>Workbench</strong>: <code class="language-plaintext highlighter-rouge">http://localhost:7080/</code></li>
  <li><strong>MCP endpoint</strong>: <code class="language-plaintext highlighter-rouge">http://localhost:7080/mcp</code></li>
</ul>
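<p>Before wiring the endpoint into an editor, it can help to check that the container answers at all. The sketch below builds an MCP initialize request and posts it to the endpoint with the JDK HTTP client. It rests on assumptions that should be verified against your DBHub version: that the HTTP transport accepts a JSON-RPC initialize call at /mcp, and that the protocol version string shown here is accepted. McpProbeSketch and the client name probe are made up for illustration.</p>

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class McpProbeSketch {
    // Builds a JSON-RPC "initialize" request for an MCP endpoint served over HTTP (assumed shape, see lead-in).
    static HttpRequest initializeRequest(String endpoint) {
        String body = "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\","
                + "\"params\":{\"protocolVersion\":\"2025-03-26\",\"capabilities\":{},"
                + "\"clientInfo\":{\"name\":\"probe\",\"version\":\"0.1\"}}}";
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json, text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = initializeRequest("http://localhost:7080/mcp");
        HttpClient client = HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(3)).build();
        try {
            var response = client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically a refused connection when the container is not running or the port is not published.
            System.out.println("DBHub not reachable: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

<p>A 200 response with a JSON or event-stream body suggests that the endpoint is up. A refused connection usually points at the container or at the published port.</p>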

<hr />

<h2 id="二cursor-中的-mcp-配置">2. MCP configuration in Cursor</h2>

<p>Cursor supports two connection types: <strong>stdio</strong> (local) and <strong>HTTP</strong> (remote/shared).</p>

<h3 id="方式一http-连接推荐配合-docker">Option 1: HTTP connection (recommended with Docker)</h3>

<p>When DBHub runs in HTTP mode (for example, deployed with Docker), configure it in Cursor as follows:</p>

<p><strong>Windows</strong>: edit <code class="language-plaintext highlighter-rouge">%USERPROFILE%\.cursor\mcp.json</code></p>

<p><strong>macOS/Linux</strong>: edit <code class="language-plaintext highlighter-rouge">~/.cursor/mcp.json</code></p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"mcpServers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"dbhub"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"url"</span><span class="p">:</span><span class="w"> </span><span class="s2">"http://localhost:7080/mcp"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<hr />

<h2 id="三验证与使用">3. Verification and usage</h2>

<ol>
  <li>
    <p>After saving <code class="language-plaintext highlighter-rouge">mcp.json</code>, restart Cursor or reload the window.</p>
  </li>
  <li>
    <p>In <strong>Cursor Settings → Tools &amp; MCP</strong>, confirm that DBHub is loaded.</p>

    <p><img src="/assets/images/posts/2025-03-04-dbhub-docker-cursor-mcp/image-20260304135751925.png" alt="MCP settings screen" /></p>
  </li>
  <li>
    <p>In a chat, try prompts such as:</p>
    <ul>
      <li>"Which schemas does the database have?"</li>
      <li>"Which tables are in the public schema?"</li>
      <li>"Find the 5 highest-paid employees"</li>
    </ul>
  </li>
</ol>

<p>The AI then accesses the database and runs the queries through DBHub's MCP tools.</p>

<hr />

<h2 id="四参考链接">4. References</h2>

<ul>
  <li><a href="https://dbhub.ai/">DBHub official documentation</a></li>
  <li><a href="https://github.com/bytebase/dbhub">DBHub GitHub</a></li>
  <li><a href="https://cursor.com/docs/context/mcp">Cursor MCP documentation</a></li>
</ul>]]></content><author><name>ed</name></author><category term="运维" /><category term="MCP" /><category term="Cursor" /><summary type="html"><![CDATA[Project-level MCP configuration in Cursor and Excel MCP]]></summary></entry><entry><title type="html">[Translation] I Replaced Redis with PostgreSQL (and It's Faster)</title><link href="https://gaoxingliang.github.io/blog/2026/01/26/postgresql-redis-157397573/" rel="alternate" type="text/html" title="[Translation] I Replaced Redis with PostgreSQL (and It's Faster)" /><published>2026-01-26T08:53:49+00:00</published><updated>2026-01-26T08:53:49+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/01/26/postgresql-redis-157397573</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/01/26/postgresql-redis-157397573/"><![CDATA[<h4 id="文章目录">Table of contents</h4>

<ul>
  <li><a href="#_1">Introduction</a></li>
  <li><a href="#_PostgreSQL__Redis_5">I Replaced Redis with PostgreSQL (and It's Faster)</a></li>
  <li>
    <ul>
      <li><a href="#_Redis__20">The setup: what I used Redis for</a></li>
      <li>
        <ul>
          <li><a href="#1_70__24">1. Caching (70% of usage)</a></li>
          <li><a href="#2_20__31">2. Pub/sub (20% of usage)</a></li>
          <li><a href="#3_10__38">3. Background job queue (10% of usage)</a></li>
        </ul>
      </li>
      <li><a href="#_Redis_54">Why I considered replacing Redis</a></li>
      <li>
        <ul>
          <li><a href="#_1_56">Reason #1: Cost</a></li>
          <li><a href="#_2_70">Reason #2: Operational complexity</a></li>
          <li><a href="#_3_93">Reason #3: Data consistency</a></li>
        </ul>
      </li>
      <li><a href="#PostgreSQL__1_UNLOGGED__113">PostgreSQL feature #1: Caching with UNLOGGED tables</a></li>
      <li>
        <ul>
          <li><a href="#_UNLOGGED_156">What is UNLOGGED?</a></li>
        </ul>
      </li>
      <li><a href="#PostgreSQL__2_LISTENNOTIFY__175">PostgreSQL feature #2: Pub/sub with LISTEN/NOTIFY</a></li>
      <li>
        <ul>
          <li><a href="#Redis__181">Redis pub/sub</a></li>
          <li><a href="#PostgreSQL__191">PostgreSQL pub/sub</a></li>
          <li><a href="#_254">Real-world example: real-time log streaming</a></li>
        </ul>
      </li>
      <li><a href="#PostgreSQL__3_SKIP_LOCKED__336">PostgreSQL feature #3: Job queues with SKIP LOCKED</a></li>
      <li><a href="#PostgreSQL__4_436">PostgreSQL feature #4: Rate limiting</a></li>
      <li><a href="#PostgreSQL__5_JSONB__515">PostgreSQL feature #5: Sessions with JSONB</a></li>
      <li><a href="#_564">Real-world benchmarks</a></li>
      <li>
        <ul>
          <li><a href="#_568">Test setup</a></li>
          <li><a href="#_574">Results</a></li>
          <li><a href="#_590">Combined operations (the real advantage)</a></li>
        </ul>
      </li>
      <li><a href="#_Redis_618">When to keep Redis</a></li>
      <li>
        <ul>
          <li><a href="#1__622">1. You need extreme performance</a></li>
          <li><a href="#2__Redis__631">2. You use Redis-specific data structures</a></li>
          <li><a href="#3__653">3. You require a separate cache layer</a></li>
        </ul>
      </li>
      <li><a href="#_659">Migration strategy</a></li>
      <li>
        <ul>
          <li><a href="#_1_1__663">Phase 1: Run in parallel (week 1)</a></li>
          <li><a href="#_2_Postgres__2__676">Phase 2: Read from Postgres (week 2)</a></li>
          <li><a href="#_3_Postgres_3__692">Phase 3: Write only to Postgres (week 3)</a></li>
          <li><a href="#_4_Redis_4__702">Phase 4: Remove Redis (week 4)</a></li>
        </ul>
      </li>
      <li><a href="#_712">Code examples: full implementation</a></li>
      <li>
        <ul>
          <li><a href="#PostgreSQL_714">Cache module (PostgreSQL)</a></li>
          <li><a href="#_760">Pub/sub module</a></li>
          <li><a href="#_840">Job queue module</a></li>
        </ul>
      </li>
      <li><a href="#_960">Performance tuning tips</a></li>
      <li>
        <ul>
          <li><a href="#1__962">1. Use connection pooling</a></li>
          <li><a href="#2__996">2. Add appropriate indexes</a></li>
          <li><a href="#3__PostgreSQL__1004">3. Tune the PostgreSQL configuration</a></li>
          <li><a href="#4__1014">4. Regular maintenance</a></li>
        </ul>
      </li>
      <li><a href="#3__1027">Results: 3 months later</a></li>
      <li><a href="#_1047">Decision matrix</a></li>
      <li><a href="#_1067">Resources</a></li>
      <li><a href="#TLDR_1087">TL;DR</a></li>
    </ul>
  </li>
</ul>

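<h2 id="引言">Introduction</h2>

<p>This article gives a good account of replacing Redis with PostgreSQL. Individual operations become slower, but typical business operations that combine several steps become faster overall, so the approach is worth considering.</p>

<p><a href="https://dev.to/polliog/i-replaced-redis-with-postgresql-and-its-faster-4942">Original article</a></p>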

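<h2 id="我用-postgresql-替换了-redis而且更快">I Replaced Redis with PostgreSQL (and It's Faster)</h2>

<p>I had a typical web application stack:</p>

<ul>
  <li>PostgreSQL for persistent data</li>
  <li>Redis for caching, pub/sub, and background jobs</li>
</ul>

<p><strong>Two databases. Two things to manage. Two points of failure.</strong></p>

<p>Then I realized: <strong>PostgreSQL can do everything Redis does.</strong></p>

<p>I removed Redis completely. Here is what happened.</p>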

<hr />

<h3 id="设置我之前用-redis-做什么">The setup: what I used Redis for</h3>

<p>Before the change, Redis handled three things:</p>

<h4 id="1-缓存70-的使用量">1. Caching (70% of usage)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Cache an API response
redis-cli SET "user:${id}" '{"id":123,"name":"John"}' EX 3600
</code></pre></div></div>

<h4 id="2-发布订阅20-的使用量">2. Pub/sub (20% of usage)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Real-time notifications
redis-cli PUBLISH notifications '{"userId":123,"message":"Hello"}'
</code></pre></div></div>

<h4 id="3-后台任务队列10-的使用量">3. Background job queue (10% of usage)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Using Bull/BullMQ (the underlying operation is shown here with the Redis CLI)
redis-cli LPUSH queue:send-email '{"to":"user@example.com","subject":"Hi"}'
</code></pre></div></div>

<p><strong>Pain points:</strong></p>

<ul>
  <li>Two databases to back up</li>
  <li>Redis uses memory (expensive at scale)</li>
  <li>Redis persistence is... complicated</li>
  <li>A network hop between Postgres and Redis</li>
</ul>

<hr />

<h3 id="为什么我考虑替换-redis">Why I considered replacing Redis</h3>

<h4 id="原因-1成本">Reason #1: Cost</h4>

<p><strong>My Redis setup:</strong></p>

<ul>
  <li>AWS ElastiCache: $45/month (2 GB)</li>
  <li>Growing to 5 GB would cost $110/month</li>
</ul>

<p><strong>PostgreSQL:</strong></p>

<ul>
  <li>Already paying for RDS: $50/month (20 GB of storage)</li>
  <li>Adding 5 GB of data: $0.50/month</li>
</ul>

<p><strong>Potential savings:</strong> about $100/month</p>

<h4 id="原因-2运维复杂性">Reason #2: Operational complexity</h4>

<p><strong>With Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Postgres backups ✅
Redis backups ❓ (RDB? AOF? Both?)
Postgres monitoring ✅
Redis monitoring ❓
Postgres failover ✅
Redis Sentinel/Cluster ❓
</code></pre></div></div>

<p><strong>Without Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Postgres backups ✅
Postgres monitoring ✅
Postgres failover ✅
</code></pre></div></div>

<p>One fewer moving part.</p>

<h4 id="原因-3数据一致性">Reason #3: Data consistency</h4>

<p><strong>The classic problem:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Update the database
psql -c "UPDATE users SET name = 'John' WHERE id = 123;"

# Invalidate the cache
redis-cli DEL "user:123"

# ⚠️ What if Redis is down?
# ⚠️ What if this operation fails?
# Now the cache and the database are out of sync
</code></pre></div></div>

<p>With Postgres handling everything: <strong>transactions solve this problem.</strong></p>
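<p>As a rough illustration of what the transaction buys you, the sketch below updates the row and evicts the cached copy in one JDBC transaction, so either both changes are committed or neither is. It assumes a users table and a cache table with a key column, like the one the article creates in the next section. The class name, the method name and the SQL constants are hypothetical.</p>

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TxCacheSketch {
    static final String UPDATE_USER = "UPDATE users SET name = ? WHERE id = ?";
    static final String EVICT_CACHE = "DELETE FROM cache WHERE key = ?";

    // Updates the user and removes the cached copy atomically; rolls back both on any failure.
    static void renameUser(Connection conn, long id, String name) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement update = conn.prepareStatement(UPDATE_USER);
             PreparedStatement evict = conn.prepareStatement(EVICT_CACHE)) {
            update.setString(1, name);
            update.setLong(2, id);
            update.executeUpdate();

            evict.setString(1, "user:" + id);
            evict.executeUpdate();

            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            System.out.println("usage: java TxCacheSketch jdbc-url (requires the PostgreSQL JDBC driver on the classpath)");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            renameUser(conn, 123L, "John");
        }
    }
}
```

<p>With Redis in the picture, the second step is a separate network call that can fail on its own. Here a failure in either statement rolls back both.</p>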

<hr />

<h3 id="postgresql-功能-1使用-unlogged-表进行缓存">PostgreSQL feature #1: Caching with UNLOGGED tables</h3>

<p><strong>Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>redis-cli SET "session:abc123" '{"userId":123,"role":"admin"}' EX 3600
</code></pre></div></div>

<p><strong>PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE UNLOGGED TABLE cache (
  key TEXT PRIMARY KEY,
  value JSONB NOT NULL,
  expires_at TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_cache_expires ON cache(expires_at);
</code></pre></div></div>

<p><strong>Insert:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INSERT INTO cache (key, value, expires_at)
VALUES ('user:123', '{"id":123,"name":"John"}'::jsonb, NOW() + INTERVAL '1 hour')
ON CONFLICT (key) DO UPDATE
  SET value = EXCLUDED.value,
      expires_at = EXCLUDED.expires_at;
</code></pre></div></div>

<p><strong>Read:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT value FROM cache
WHERE key = 'user:123' AND expires_at &gt; NOW();
</code></pre></div></div>

<p><strong>Cleanup (run periodically):</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DELETE FROM cache WHERE expires_at &lt; NOW();
</code></pre></div></div>
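<p>The three statements above define the cache's semantics: an upsert on write, a read that ignores expired rows, and a periodic delete. The following is a minimal in-memory sketch of those semantics in Java, useful for reasoning about expiry or for unit tests without a database. The class name <code class="language-plaintext highlighter-rouge">TtlCacheSketch</code> and its methods are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// In-memory model of the cache table above: key -&gt; (value, expires_at).
// Hypothetical illustration only; the real cache lives in PostgreSQL.
final class TtlCacheSketch {
    private static final class Entry {
        final String value;
        final Instant expiresAt;
        Entry(String value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map&lt;String, Entry&gt; rows = new HashMap&lt;&gt;();

    // INSERT ... ON CONFLICT (key) DO UPDATE: the newest value and expiry win.
    void put(String key, String value, Instant now, Duration ttl) {
        rows.put(key, new Entry(value, now.plus(ttl)));
    }

    // SELECT value FROM cache WHERE key = ? AND expires_at &gt; NOW()
    Optional&lt;String&gt; get(String key, Instant now) {
        Entry e = rows.get(key);
        if (e == null || !e.expiresAt.isAfter(now)) {
            return Optional.empty();
        }
        return Optional.of(e.value);
    }

    // DELETE FROM cache WHERE expires_at &lt; NOW(); returns the number of rows removed.
    int cleanup(Instant now) {
        int before = rows.size();
        rows.values().removeIf(e -&gt; e.expiresAt.isBefore(now));
        return before - rows.size();
    }

    int size() {
        return rows.size();
    }
}
</code></pre></div></div>

<p>Note that an expired row is already invisible to <code class="language-plaintext highlighter-rouge">get</code> before <code class="language-plaintext highlighter-rouge">cleanup</code> runs; the cleanup only reclaims space, which is why it can run on a relaxed schedule.</p>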

<h4 id="什么是-unlogged">What is UNLOGGED?</h4>

<p><strong>UNLOGGED tables:</strong></p>

<ul>
  <li>Skip the write-ahead log (WAL)</li>
  <li>Write faster as a result</li>
  <li>Are truncated after a crash or unclean shutdown, and are not replicated to standbys (acceptable for a cache)</li>
</ul>

<p><strong>Performance:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Redis SET: 0.05ms
Postgres UNLOGGED INSERT: 0.08ms
</code></pre></div></div>

<p><strong>Close enough for caching.</strong></p>

<hr />

<h3 id="postgresql-功能-2使用-listennotify-进行发布订阅">PostgreSQL feature #2: Pub/sub with LISTEN/NOTIFY</h3>

<p><strong>This is where it gets interesting.</strong></p>

<p>PostgreSQL has <strong>native pub/sub</strong>, and most developers don't know about it.</p>

<h4 id="redis-发布订阅">Redis pub/sub</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Publisher
redis-cli PUBLISH notifications '{"userId":123,"msg":"Hello"}'

# Subscriber (in another terminal)
redis-cli SUBSCRIBE notifications
</code></pre></div></div>

<h4 id="postgresql-发布订阅">PostgreSQL pub/sub</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Publisher
NOTIFY notifications, '{"userId": 123, "msg": "Hello"}';
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Subscriber (Java with the PostgreSQL JDBC driver)
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;

// Open a dedicated connection; DATABASE_URL must hold a JDBC URL (jdbc:postgresql://...)
String url = System.getenv("DATABASE_URL");
Connection conn = DriverManager.getConnection(url);
try (Statement stmt = conn.createStatement()) {
    stmt.execute("LISTEN notifications");
}

// Unwrap to PGConnection to receive notifications
PGConnection pgConn = conn.unwrap(PGConnection.class);
ObjectMapper mapper = new ObjectMapper(); // create once and reuse

// Listen for notifications on a separate thread
new Thread(() -&gt; {
    while (true) {
        try {
            // Block for up to 500 ms waiting for notifications
            PGNotification[] notifications = pgConn.getNotifications(500);
            if (notifications != null) {
                for (PGNotification notification : notifications) {
                    String payload = notification.getParameter();
                    Map&lt;String, Object&gt; data = mapper.readValue(payload, Map.class);
                    System.out.println(data);
                }
            }
        } catch (SQLException e) {
            // The connection is broken: stop instead of spinning
            e.printStackTrace();
            break;
        } catch (Exception e) {
            // A malformed payload should not kill the listener
            e.printStackTrace();
        }
    }
}).start();
</code></pre></div></div>

<p><strong>Performance comparison:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Redis pub/sub latency: 1-2ms
Postgres NOTIFY latency: 2-5ms
</code></pre></div></div>

<p><strong>Slightly slower, but:</strong></p>

<ul>
  <li>No extra infrastructure</li>
  <li>Can be used inside a transaction</li>
  <li>Can be combined with queries</li>
</ul>

<h4 id="真实世界示例实时日志流">Real-world example: live log streaming</h4>

<p>In my log management application, I needed <strong>live log streaming</strong>.</p>

<p><strong>With Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># When a new log arrives
psql -c "INSERT INTO logs ..."
redis-cli PUBLISH logs:new '{"id":123,"message":"..."}'

# The frontend listens
redis-cli SUBSCRIBE logs:new
</code></pre></div></div>

<p><strong>Problem:</strong> two separate operations. What if the publish fails?</p>

<p><strong>With PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE FUNCTION notify_new_log() RETURNS TRIGGER AS $$
BEGIN
  PERFORM pg_notify('logs_new', row_to_json(NEW)::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_inserted
AFTER INSERT ON logs
FOR EACH ROW EXECUTE FUNCTION notify_new_log();
</code></pre></div></div>

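<p>Now it is <strong>atomic</strong>. The insert and the notification happen together or not at all, because a notification raised inside a transaction is delivered only when that transaction commits.</p>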

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Frontend (via SSE) - Spring Boot example
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;
import java.sql.Connection;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Inside a @RestController that has a DataSource field named dataSource.
// One shared pool instead of a new executor per request.
private final ExecutorService executor = Executors.newCachedThreadPool();

@GetMapping(value = "/logs/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public SseEmitter streamLogs() {
    SseEmitter emitter = new SseEmitter(Long.MAX_VALUE);
    
    executor.execute(() -&gt; {
        // try-with-resources closes the connection once the stream ends
        try (Connection conn = dataSource.getConnection()) {
            try (Statement stmt = conn.createStatement()) {
                stmt.execute("LISTEN logs_new");
            }
            
            PGConnection pgConn = conn.unwrap(PGConnection.class);
            
            while (true) {
                // Block for up to 1 second waiting for notifications
                PGNotification[] notifications = pgConn.getNotifications(1000);
                if (notifications != null) {
                    for (PGNotification notification : notifications) {
                        // SseEmitter adds the "data:" prefix and the blank line itself
                        emitter.send(SseEmitter.event()
                            .data(notification.getParameter()));
                    }
                }
            }
        } catch (Exception e) {
            emitter.completeWithError(e);
        }
    });
    
    return emitter;
}
</code></pre></div></div>

<p><strong>Result:</strong> live log streaming with zero Redis.</p>
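<p>One caveat for the trigger above: in PostgreSQL's default configuration a NOTIFY payload must be shorter than 8000 bytes, so sending a whole row with <code class="language-plaintext highlighter-rouge">row_to_json(NEW)</code> can fail for large log lines. When publishing from application code, a size check along the following lines is one way to guard against that. This is a sketch under assumptions: the class and method names are hypothetical, the server encoding is assumed to be UTF-8, and the fallback of sending only the row id is an illustration rather than the author's approach.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.nio.charset.StandardCharsets;

// Hypothetical helper for choosing a NOTIFY payload that fits the size limit.
final class NotifyPayloadSketch {
    // Documented default: the payload must be shorter than 8000 bytes.
    static final int MAX_PAYLOAD_BYTES = 7999;

    // True if the payload fits, measured in UTF-8 bytes (assumed server encoding).
    static boolean fits(String payload) {
        return payload.getBytes(StandardCharsets.UTF_8).length &lt;= MAX_PAYLOAD_BYTES;
    }

    // Send the full row if it fits; otherwise send only its id so that the
    // listener can fetch the row with a normal SELECT.
    static String payloadFor(long id, String fullRowJson) {
        return fits(fullRowJson) ? fullRowJson : "{\"id\":" + id + "}";
    }
}
</code></pre></div></div>

<p>Sending only the id keeps every notification small, at the cost of one extra query per event on the listening side.</p>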

<hr />

<h3 id="postgresql-功能-3使用-skip-locked-的任务队列">PostgreSQL feature #3: Job queues with SKIP LOCKED</h3>

<p><strong>Redis (as used by Bull/BullMQ):</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Enqueue
redis-cli LPUSH queue:send-email '{"to":"user@example.com","subject":"Hi"}'

# Dequeue (blocking)
redis-cli BRPOP queue:send-email 5
</code></pre></div></div>

<p><strong>PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE jobs (
  id BIGSERIAL PRIMARY KEY,
  queue TEXT NOT NULL,
  payload JSONB NOT NULL,
  attempts INT DEFAULT 0,
  max_attempts INT DEFAULT 3,
  scheduled_at TIMESTAMPTZ DEFAULT NOW(),
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_jobs_queue ON jobs(queue, scheduled_at) 
WHERE attempts &lt; max_attempts;
</code></pre></div></div>

<p><strong>Enqueue:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INSERT INTO jobs (queue, payload)
VALUES ('send-email', '{"to": "user@example.com", "subject": "Hi"}'::jsonb);
</code></pre></div></div>

<p><strong>Worker (dequeue):</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WITH next_job AS (
  SELECT id FROM jobs
  WHERE queue = 'send-email'
    AND attempts &lt; max_attempts
    AND scheduled_at &lt;= NOW()
  ORDER BY scheduled_at
  LIMIT 1
  FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET attempts = attempts + 1
FROM next_job
WHERE jobs.id = next_job.id
RETURNING *;
</code></pre></div></div>

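<p><strong>The magic: <code class="language-plaintext highlighter-rouge">FOR UPDATE SKIP LOCKED</code></strong></p>

<p>This lets PostgreSQL act as a queue in which <strong>workers never block each other</strong>:</p>

<ul>
  <li>Multiple workers can pull jobs concurrently</li>
  <li>A row locked by one worker is skipped by the others, so no job is claimed twice while its transaction is open</li>
  <li>If a worker crashes mid-transaction, its lock is released and the job becomes available again</li>
</ul>

<p><strong>Performance:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Redis BRPOP: 0.1ms
Postgres SKIP LOCKED: 0.3ms
</code></pre></div></div>

<p><strong>The difference is negligible for most workloads.</strong><br />
 [Translator's note]: the following example may be simpler:</p>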

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;

-- Claim one pending task, locking it and skipping rows locked by other workers
UPDATE tasks 
SET status = 'processing',
    updated_at = NOW()  -- needed by the timeout-based release below
WHERE id = (
    SELECT id 
    FROM tasks 
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 1  -- for batch claiming, use LIMIT 10 and change "id =" to "id IN"
    FOR UPDATE SKIP LOCKED  -- 👈 the key part
)
RETURNING *;

COMMIT;

-- Stuck ("zombie") tasks also need to be released:
-- a task still in processing after 5 minutes goes back to pending
UPDATE tasks 
SET status = 'pending'
WHERE status = 'processing'
AND updated_at &lt; NOW() - INTERVAL '5 minutes';
</code></pre></div></div>
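<p>The claim-and-skip behaviour used by both queries above can be modelled in memory. The sketch below is a simplified analogy rather than how PostgreSQL implements row locks: an <code class="language-plaintext highlighter-rouge">AtomicBoolean</code> stands in for the row lock, and all class and method names are hypothetical.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;

// In-memory analogy for SELECT ... FOR UPDATE SKIP LOCKED.
final class SkipLockedSketch {
    static final class Job {
        final long id;
        // Stands in for the row lock held by a worker's open transaction.
        final AtomicBoolean locked = new AtomicBoolean(false);
        Job(long id) {
            this.id = id;
        }
    }

    private final List&lt;Job&gt; jobs = new ArrayList&lt;&gt;();

    synchronized void enqueue(long id) {
        jobs.add(new Job(id));
    }

    // Take the first job whose lock can be acquired, skipping jobs that
    // another worker has already locked instead of waiting for them.
    synchronized Optional&lt;Job&gt; claim() {
        for (Job job : jobs) {
            if (job.locked.compareAndSet(false, true)) {
                return Optional.of(job);
            }
        }
        return Optional.empty();
    }

    // ROLLBACK or a crashed worker: the lock is dropped and the job is claimable again.
    void release(Job job) {
        job.locked.set(false);
    }

    // COMMIT after deleting the row: the job is gone for good.
    synchronized void complete(Job job) {
        jobs.remove(job);
    }
}
</code></pre></div></div>

<p>The property to notice is that <code class="language-plaintext highlighter-rouge">claim()</code> never waits: a second worker moves on to the next job instead of queuing up behind the first one.</p>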

<hr />

<h3 id="postgresql-功能-4速率限制">PostgreSQL feature #4: Rate limiting</h3>

<p><strong>Redis (the classic rate limiter):</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Increment the counter
redis-cli INCR "ratelimit:${userId}"
redis-cli EXPIRE "ratelimit:${userId}" 60

# Check whether the limit is exceeded
redis-cli GET "ratelimit:${userId}"
</code></pre></div></div>

<p><strong>PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE rate_limits (
  user_id INT PRIMARY KEY,
  request_count INT DEFAULT 0,
  window_start TIMESTAMPTZ DEFAULT NOW()
);

-- Check and increment (assumes a row for this user already exists)
WITH current AS (
  SELECT 
    request_count,
    CASE 
      WHEN window_start &lt; NOW() - INTERVAL '1 minute'
      THEN 1  -- reset the counter
      ELSE request_count + 1
    END AS new_count
  FROM rate_limits
  WHERE user_id = 123
  FOR UPDATE
)
UPDATE rate_limits
SET 
  request_count = (SELECT new_count FROM current),
  window_start = CASE
    WHEN window_start &lt; NOW() - INTERVAL '1 minute'
    THEN NOW()
    ELSE window_start
  END
WHERE user_id = 123
RETURNING request_count;
</code></pre></div></div>

<p><strong>Or, more simply, count the requests in a sliding time window:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE api_requests (
  user_id INT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Check the rate limit
SELECT COUNT(*) FROM api_requests
WHERE user_id = 123
  AND created_at &gt; NOW() - INTERVAL '1 minute';

-- If under the limit, record the request
INSERT INTO api_requests (user_id) VALUES (123);

-- Periodically clean up old requests
DELETE FROM api_requests WHERE created_at &lt; NOW() - INTERVAL '5 minutes';
</code></pre></div></div>

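<p><strong>When Postgres is the better choice:</strong></p>

<ul>
  <li>You need rate limiting based on complex logic (not just counting)</li>
  <li>You want rate-limit data in the same transaction as your business logic</li>
</ul>

<p><strong>When Redis is the better choice:</strong></p>

<ul>
  <li>You need sub-millisecond rate limiting</li>
  <li>Throughput is extremely high (millions of requests per second)</li>
</ul>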
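<p>The fixed-window logic of the first PostgreSQL query in this section (reset the counter to 1 once the window has expired, otherwise increment it) can be sketched in plain Java as follows. This is an illustration under assumptions: the names are hypothetical, the clock is passed in so that the behaviour is deterministic, and unlike the SQL it also creates the counter on a user's first request.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of a fixed-window rate limiter keyed by user id.
final class FixedWindowLimiterSketch {
    private static final class Window {
        Instant start; // window_start
        int count;     // request_count
    }

    private final int limit;
    private final Duration window;
    private final Map&lt;Integer, Window&gt; byUser = new HashMap&lt;&gt;();

    FixedWindowLimiterSketch(int limit, Duration window) {
        this.limit = limit;
        this.window = window;
    }

    // Returns true if the request is allowed. Mirrors the CASE expression:
    // window_start &lt; NOW() - INTERVAL ... resets to 1, otherwise count + 1.
    synchronized boolean tryAcquire(int userId, Instant now) {
        Window w = byUser.computeIfAbsent(userId, id -&gt; new Window());
        if (w.start == null || w.start.isBefore(now.minus(window))) {
            w.start = now;
            w.count = 1;
        } else {
            w.count++;
        }
        return w.count &lt;= limit;
    }
}
</code></pre></div></div>

<p>A fixed window is simple but allows short bursts around a window boundary; the request-log variant shown above trades extra rows for a true sliding window.</p>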

<hr />

<h3 id="postgresql-功能-5使用-jsonb-的会话">PostgreSQL feature #5: Sessions with JSONB</h3>

<p><strong>Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>redis-cli SET "session:${sessionId}" '{"userId":123,"role":"admin"}' EX 86400
</code></pre></div></div>

<p><strong>PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE sessions (
  id TEXT PRIMARY KEY,
  data JSONB NOT NULL,
  expires_at TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_sessions_expires ON sessions(expires_at);

-- Insert/update
INSERT INTO sessions (id, data, expires_at)
VALUES ('abc123', '{"userId":123,"role":"admin"}'::jsonb, NOW() + INTERVAL '24 hours')
ON CONFLICT (id) DO UPDATE
  SET data = EXCLUDED.data,
      expires_at = EXCLUDED.expires_at;

-- Read
SELECT data FROM sessions
WHERE id = 'abc123' AND expires_at &gt; NOW();
</code></pre></div></div>

<p><strong>Bonus: JSONB operators</strong></p>

<p>You can query inside the session data:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Find all sessions for a given user
SELECT * FROM sessions
WHERE data-&gt;&gt;'userId' = '123';

-- Find sessions with a given role ("role" is a top-level key in the data above)
SELECT * FROM sessions
WHERE data-&gt;&gt;'role' = 'admin';
</code></pre></div></div>

<p><strong>You can't do this with plain Redis string values!</strong></p>

<hr />

<h3 id="真实世界基准测试">Real-world benchmarks</h3>

<p>I ran benchmarks on a production dataset:</p>

<h4 id="测试设置">Test setup</h4>

<ul>
  <li><strong>Hardware:</strong> AWS RDS db.t3.medium (2 vCPU, 4GB RAM)</li>
  <li><strong>Dataset:</strong> 1 million cache entries, 10,000 sessions</li>
  <li><strong>Tool:</strong> pgbench (custom scripts)</li>
</ul>

<h4 id="结果">Results</h4>

<table>
  <thead>
    <tr>
      <th>Operation</th>
      <th>Redis</th>
      <th>PostgreSQL</th>
      <th>Difference</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Cache SET</strong></td>
      <td>0.05ms</td>
      <td>0.08ms</td>
      <td>+60% slower</td>
    </tr>
    <tr>
      <td><strong>Cache GET</strong></td>
      <td>0.04ms</td>
      <td>0.06ms</td>
      <td>+50% slower</td>
    </tr>
    <tr>
      <td><strong>Pub/sub</strong></td>
      <td>1.2ms</td>
      <td>3.1ms</td>
      <td>+158% slower</td>
    </tr>
    <tr>
      <td><strong>Queue push</strong></td>
      <td>0.08ms</td>
      <td>0.15ms</td>
      <td>+87% slower</td>
    </tr>
    <tr>
      <td><strong>Queue pop</strong></td>
      <td>0.12ms</td>
      <td>0.31ms</td>
      <td>+158% slower</td>
    </tr>
  </tbody>
</table>

<p><strong>PostgreSQL is slower… but:</strong></p>

<ul>
  <li>Cache and queue operations are all still under 1ms (pub/sub is about 3ms)</li>
  <li>The network hop to Redis is gone</li>
  <li>Infrastructure complexity is reduced</li>
</ul>
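<p>The "difference" column in the results table is the relative slowdown, (PostgreSQL latency − Redis latency) / Redis latency. A small sketch for re-deriving it from the reported latencies follows; the class name is hypothetical, and latencies are passed in microseconds so that the arithmetic stays in exact integers. Integer division truncates, which is consistent with the table showing 87% for the queue push row (exactly 87.5%).</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Recomputes the relative slowdown figures from the benchmark table.
final class SlowdownSketch {
    // Both latencies are in microseconds, e.g. 0.05ms = 50.
    // The result is truncated to a whole percentage.
    static long slowdownPercent(long redisMicros, long postgresMicros) {
        return (postgresMicros - redisMicros) * 100 / redisMicros;
    }
}
</code></pre></div></div>

<p>For example, <code class="language-plaintext highlighter-rouge">SlowdownSketch.slowdownPercent(50, 80)</code> corresponds to the cache SET row (0.05ms against 0.08ms).</p>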

<h4 id="组合操作真正的优势">Combined operations (the real advantage)</h4>

<p><strong>Scenario:</strong> insert data + invalidate the cache + notify subscribers</p>

<p><strong>With Redis:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>psql -c "INSERT INTO posts ..."                    # 2ms
redis-cli DEL "posts:latest"                        # 1ms (network hop)
redis-cli PUBLISH posts:new '{"id":123}'            # 1ms (network hop)
# Total: ~4ms
</code></pre></div></div>

<p><strong>With PostgreSQL:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BEGIN;
INSERT INTO posts ...;                              -- 2ms
DELETE FROM cache WHERE key = 'posts:latest';      -- 0.1ms (same connection)
NOTIFY posts_new, '...';                            -- 0.1ms (same connection)
COMMIT;
-- Total: ~2.2ms
</code></pre></div></div>

<p><strong>When operations are combined, PostgreSQL is faster.</strong></p>

<hr />

<h3 id="何时保留-redis">When to keep Redis</h3>

<p><strong>Don't replace Redis if:</strong></p>

<h4 id="1-你需要极致性能">1. You need extreme performance</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Redis: 100,000+ ops/sec (single instance)
Postgres: 10,000-50,000 ops/sec
</code></pre></div></div>

<p>If you serve millions of cache reads per second, keep Redis.</p>

<h4 id="2-你使用-redis-特定的数据结构">2. You use Redis-specific data structures</h4>

<p><strong>Redis has:</strong></p>

<ul>
  <li>Sorted sets (leaderboards)</li>
  <li>HyperLogLog (approximate unique counts)</li>
  <li>Geospatial indexes</li>
  <li>Streams (advanced pub/sub)</li>
</ul>

<p><strong>Postgres equivalents exist but are clunkier:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Leaderboard in Postgres (slower)
SELECT user_id, score
FROM leaderboard
ORDER BY score DESC
LIMIT 10;

-- vs Redis
redis-cli ZREVRANGE leaderboard 0 9 WITHSCORES
</code></pre></div></div>

<h4 id="3-你有独立的缓存层要求">3. You require a separate cache layer</h4>

<p>If your architecture calls for an independent cache layer (for example, one shared across microservices), keep Redis.</p>

<hr />

<h3 id="迁移策略">Migration strategy</h3>

<p><strong>Don't remove Redis overnight.</strong> Here is how I did it:</p>

<h4 id="阶段-1并行运行第-1-周">Phase 1: Run in parallel (week 1)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Write to both
jedis.set(key, value);
jdbcTemplate.update("INSERT INTO cache ...", key, value);

// Read from Redis (still the primary)
String data = jedis.get(key);
</code></pre></div></div>

<p><strong>Monitor:</strong> compare hit rates and latency.</p>

<h4 id="阶段-2从-postgres-读取第-2-周">Phase 2: Read from Postgres (week 2)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Try Postgres first
String data;
try {
    data = jdbcTemplate.queryForObject(
        "SELECT value FROM cache WHERE key = ? AND expires_at &gt; NOW()",
        String.class, key);
} catch (org.springframework.dao.EmptyResultDataAccessException e) {
    data = null; // queryForObject throws when no row matches
}

// Fall back to Redis
if (data == null || data.isEmpty()) {
    data = jedis.get(key);
}
</code></pre></div></div>

<p><strong>Monitor:</strong> error rates and performance.</p>

<h4 id="阶段-3只写入-postgres第-3-周">Phase 3: Write only to Postgres (week 3)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Write only to Postgres (cast the value to jsonb; upsert so that repeated keys don't fail)
jdbcTemplate.update(
    "INSERT INTO cache (key, value, expires_at) VALUES (?, ?::jsonb, ?) " +
    "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at",
    key, value, LocalDateTime.now().plusHours(1));
</code></pre></div></div>

<p><strong>Monitor:</strong> is everything still working?</p>

<h4 id="阶段-4移除-redis第-4-周">Phase 4: Remove Redis (week 4)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Shut down Redis
# Watch for errors
# Nothing broke? Success!
</code></pre></div></div>

<hr />

<h3 id="代码示例完整实现">Code example: full implementation</h3>

<h4 id="缓存模块postgresql">Cache module (PostgreSQL)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// PostgresCache.java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;
import java.util.List;

@Component
public class PostgresCache {
    private final JdbcTemplate jdbcTemplate;
    
    public PostgresCache(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }
    
    // Returns null on a cache miss (queryForObject would throw instead)
    public String get(String key) {
        List&lt;String&gt; rows = jdbcTemplate.queryForList(
            "SELECT value FROM cache WHERE key = ? AND expires_at &gt; NOW()",
            String.class, key);
        return rows.isEmpty() ? null : rows.get(0);
    }
    
    public void set(String key, String value, int ttlSeconds) {
        // "INTERVAL ?" is not valid SQL, so scale a one-second interval instead
        jdbcTemplate.update(
            "INSERT INTO cache (key, value, expires_at) " +
            "VALUES (?, ?::jsonb, NOW() + ? * INTERVAL '1 second') " +
            "ON CONFLICT (key) DO UPDATE " +
            "SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at",
            key, value, ttlSeconds);
    }
    
    public void set(String key, String value) {
        set(key, value, 3600);
    }
    
    public void delete(String key) {
        jdbcTemplate.update("DELETE FROM cache WHERE key = ?", key);
    }
    
    public void cleanup() {
        jdbcTemplate.update("DELETE FROM cache WHERE expires_at &lt; NOW()");
    }
}
</code></pre></div></div>

<h4 id="发布订阅模块">Pub/sub module</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// PostgresPubSub.java
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.sql.Connection;
import java.sql.Statement;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

@Component
public class PostgresPubSub {
    private final JdbcTemplate jdbcTemplate;
    private final ObjectMapper objectMapper;
    private final Map&lt;String, Connection&gt; listeners = new ConcurrentHashMap&lt;&gt;();
    private final ExecutorService executor = Executors.newCachedThreadPool();
    
    public PostgresPubSub(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
        this.objectMapper = new ObjectMapper();
    }
    
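    public void publish(String channel, Object message) throws Exception {
        String payload = objectMapper.writeValueAsString(message);
        // pg_notify is called through a SELECT, which returns a result set,
        // so run it with execute() rather than update()
        jdbcTemplate.execute("SELECT pg_notify(?, ?)",
            (org.springframework.jdbc.core.PreparedStatementCallback&lt;Boolean&gt;) ps -&gt; {
                ps.setString(1, channel);
                ps.setString(2, payload);
                return ps.execute();
            });
    }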
    
    public void subscribe(String channel, Consumer&lt;Map&lt;String, Object&gt;&gt; callback) {
        executor.execute(() -&gt; {
            try {
                Connection conn = jdbcTemplate.getDataSource().getConnection();
                Statement stmt = conn.createStatement();
                stmt.execute("LISTEN " + channel);
                stmt.close();
                
                listeners.put(channel, conn);
                
                PGConnection pgConn = conn.unwrap(PGConnection.class);
                while (listeners.containsKey(channel)) {
                    PGNotification[] notifications = pgConn.getNotifications();
                    if (notifications != null) {
                        for (PGNotification notification : notifications) {
                            if (notification.getName().equals(channel)) {
                                Map&lt;String, Object&gt; data = objectMapper.readValue(
                                    notification.getParameter(), Map.class);
                                callback.accept(data);
                            }
                        }
                    }
                    Thread.sleep(100);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }
    
    public void unsubscribe(String channel) {
        Connection conn = listeners.remove(channel);
        if (conn != null) {
            try {
                Statement stmt = conn.createStatement();
                stmt.execute("UNLISTEN " + channel);
                stmt.close();
                conn.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
</code></pre></div></div>
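<p>The module above builds its <code class="language-plaintext highlighter-rouge">LISTEN</code> and <code class="language-plaintext highlighter-rouge">UNLISTEN</code> statements by string concatenation, because these commands do not accept bind parameters. If channel names can come from anywhere other than constants in your own code, quoting them as identifiers is one way to keep that concatenation safe. The helper below is a hypothetical sketch, not part of the author's module.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helper for building LISTEN statements safely.
final class ChannelNameSketch {
    // Quote a channel name as a PostgreSQL identifier: wrap it in double
    // quotes and double any embedded double quote.
    static String quoteIdentifier(String name) {
        if (name == null || name.isEmpty() || name.indexOf('\0') &gt;= 0) {
            throw new IllegalArgumentException("invalid channel name");
        }
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    static String listenSql(String channel) {
        return "LISTEN " + quoteIdentifier(channel);
    }
}
</code></pre></div></div>

<p>A quoted identifier is case-sensitive, which matches <code class="language-plaintext highlighter-rouge">pg_notify</code>: it takes the channel as an ordinary string and does not fold it to lower case.</p>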

<h4 id="任务队列模块">Job queue module</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// PostgresQueue.java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.stereotype.Component;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.LocalDateTime;
import java.util.Map;

@Component
public class PostgresQueue {
    private final JdbcTemplate jdbcTemplate;
    private final ObjectMapper objectMapper;
    
    public PostgresQueue(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
        this.objectMapper = new ObjectMapper();
    }
    
    public void enqueue(String queue, Map&lt;String, Object&gt; payload, LocalDateTime scheduledAt) {
        try {
            String payloadJson = objectMapper.writeValueAsString(payload);
            jdbcTemplate.update(
                "INSERT INTO jobs (queue, payload, scheduled_at) VALUES (?, ?::jsonb, ?)",
                queue, payloadJson, scheduledAt);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    
    public void enqueue(String queue, Map&lt;String, Object&gt; payload) {
        enqueue(queue, payload, LocalDateTime.now());
    }
    
    public Job dequeue(String queue) {
        String sql = "WITH next_job AS (" +
            "  SELECT id FROM jobs " +
            "  WHERE queue = ? " +
            "    AND attempts &lt; max_attempts " +
            "    AND scheduled_at &lt;= NOW() " +
            "  ORDER BY scheduled_at " +
            "  LIMIT 1 " +
            "  FOR UPDATE SKIP LOCKED " +
            ") " +
            "UPDATE jobs " +
            "SET attempts = attempts + 1 " +
            "FROM next_job " +
            "WHERE jobs.id = next_job.id " +
            "RETURNING jobs.*";
        
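        // query(...) returns an empty list when no job is ready; queryForObject would throw
        java.util.List&lt;Job&gt; jobs = jdbcTemplate.query(sql, new JobRowMapper(), queue);
        return jobs.isEmpty() ? null : jobs.get(0);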
    }
    
    public void complete(Long jobId) {
        jdbcTemplate.update("DELETE FROM jobs WHERE id = ?", jobId);
    }
    
    public void fail(Long jobId, Exception error) {
        try {
            String errorJson = objectMapper.writeValueAsString(Map.of("error", error.getMessage()));
            jdbcTemplate.update(
                "UPDATE jobs " +
                "SET attempts = max_attempts, " +
                "    payload = payload || ?::jsonb " +
                "WHERE id = ?",
                errorJson, jobId);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    
    private static class JobRowMapper implements RowMapper&lt;Job&gt; {
        @Override
        public Job mapRow(ResultSet rs, int rowNum) throws SQLException {
            Job job = new Job();
            job.setId(rs.getLong("id"));
            job.setQueue(rs.getString("queue"));
            job.setPayload(rs.getString("payload"));
            job.setAttempts(rs.getInt("attempts"));
            job.setMaxAttempts(rs.getInt("max_attempts"));
            job.setScheduledAt(rs.getTimestamp("scheduled_at").toLocalDateTime());
            job.setCreatedAt(rs.getTimestamp("created_at").toLocalDateTime());
            return job;
        }
    }
    
    public static class Job {
        private Long id;
        private String queue;
        private String payload;
        private Integer attempts;
        private Integer maxAttempts;
        private LocalDateTime scheduledAt;
        private LocalDateTime createdAt;
        
        // Getters and Setters
        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getQueue() { return queue; }
        public void setQueue(String queue) { this.queue = queue; }
        public String getPayload() { return payload; }
        public void setPayload(String payload) { this.payload = payload; }
        public Integer getAttempts() { return attempts; }
        public void setAttempts(Integer attempts) { this.attempts = attempts; }
        public Integer getMaxAttempts() { return maxAttempts; }
        public void setMaxAttempts(Integer maxAttempts) { this.maxAttempts = maxAttempts; }
        public LocalDateTime getScheduledAt() { return scheduledAt; }
        public void setScheduledAt(LocalDateTime scheduledAt) { this.scheduledAt = scheduledAt; }
        public LocalDateTime getCreatedAt() { return createdAt; }
        public void setCreatedAt(LocalDateTime createdAt) { this.createdAt = createdAt; }
    }
}
</code></pre></div></div>
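<p>A worker built on a queue like this typically loops: dequeue, handle the job, then mark it completed or failed. The sketch below shows that loop against a small hypothetical interface so that it runs without a database. The interface, its method names, and the convention that <code class="language-plaintext highlighter-rouge">dequeue</code> returns null when no job is ready are all assumptions made for illustration.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import java.util.function.Consumer;

// Sketch of a worker loop over a job queue.
final class WorkerLoopSketch {
    // Hypothetical minimal view of the queue: the three operations a worker needs.
    interface JobQueue {
        String dequeue();                       // returns null when no job is ready
        void complete(String job);
        void fail(String job, Exception error);
    }

    // Processes jobs until the queue is empty; returns how many succeeded.
    static int drain(JobQueue queue, Consumer&lt;String&gt; handler) {
        int succeeded = 0;
        String job;
        while ((job = queue.dequeue()) != null) {
            try {
                handler.accept(job);
                queue.complete(job);
                succeeded++;
            } catch (RuntimeException e) {
                // A failing job is recorded and does not stop the worker.
                queue.fail(job, e);
            }
        }
        return succeeded;
    }
}
</code></pre></div></div>

<p>In a real worker the loop would sleep or wait for a notification when the queue is empty instead of returning.</p>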

<hr />

<h3 id="性能调优技巧">Performance tuning tips</h3>

<h4 id="1-使用连接池">1. Use a connection pool</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Spring Boot configuration (application.yml)
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/mydb
    username: user
    password: password
    hikari:
      maximum-pool-size: 20  # maximum number of connections
      minimum-idle: 5        # minimum number of idle connections
      connection-timeout: 2000  # connection timeout (ms)
      idle-timeout: 30000    # idle timeout (ms)
      max-lifetime: 1800000  # maximum connection lifetime (ms)

// Or use Java configuration
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import javax.sql.DataSource;

@Configuration
public class DataSourceConfig {
    @Bean
    public DataSource dataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb");
        config.setUsername("user");
        config.setPassword("password");
        config.setMaximumPoolSize(20);
        config.setMinimumIdle(5);
        config.setConnectionTimeout(2000);
        config.setIdleTimeout(30000);
        return new HikariDataSource(config);
    }
}
</code></pre></div></div>

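<h4 id="2-添加适当的索引">2. Add appropriate indexes</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- cache(key) is already covered by the primary key. A partial index with
-- WHERE expires_at &gt; NOW() is rejected: index predicates must be IMMUTABLE.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_cache_expires ON cache(expires_at);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_jobs_pending ON jobs(queue, scheduled_at) 
  WHERE attempts &lt; max_attempts;
</code></pre></div></div>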

<h4 id="3-调整-postgresql-配置">3. Tune the PostgreSQL configuration</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># postgresql.conf (values shown assume a server with 8GB of RAM)
shared_buffers = 2GB           # 25% of RAM
effective_cache_size = 6GB     # 75% of RAM
work_mem = 50MB                # for complex queries
maintenance_work_mem = 512MB   # for VACUUM
</code></pre></div></div>

<h4 id="4-定期维护">4. Regular maintenance</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Run daily
VACUUM ANALYZE cache;
VACUUM ANALYZE jobs;

-- Or tune autovacuum, which is on by default, for this table (recommended)
ALTER TABLE cache SET (autovacuum_vacuum_scale_factor = 0.1);
</code></pre></div></div>

<hr />

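<h3 id="结果3-个月后">Results: 3 months later</h3>

<p><strong>What I saved:</strong></p>

<ul>
  <li>✅ $100/month (no more ElastiCache)</li>
  <li>✅ 50% less backup complexity</li>
  <li>✅ One fewer service to monitor</li>
  <li>✅ Simpler deployments (one fewer dependency)</li>
</ul>

<p><strong>What I lost:</strong></p>

<ul>
  <li>❌ About 0.5ms of added latency on cache operations</li>
  <li>❌ Redis's exotic data structures (I didn't need them)</li>
</ul>

<p><strong>Would I do it again?</strong> Yes, for this use case.</p>

<p><strong>Would I recommend it universally?</strong> No.</p>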

<hr />

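<h3 id="决策矩阵">Decision matrix</h3>

<p><strong>Replace Redis with Postgres if:</strong></p>

<ul>
  <li>✅ You use Redis for simple caching or sessions</li>
  <li>✅ Your cache hit rate is &lt; 95% (write-heavy)</li>
  <li>✅ You want transactional consistency</li>
  <li>✅ You can accept operations that are 0.1-1ms slower</li>
  <li>✅ You are a small team with limited ops resources</li>
</ul>

<p><strong>Keep Redis if:</strong></p>

<ul>
  <li>❌ You need 100k+ ops/sec</li>
  <li>❌ You use Redis data structures (sorted sets and so on)</li>
  <li>❌ You have a dedicated ops team</li>
  <li>❌ Sub-millisecond latency is critical</li>
  <li>❌ You are doing geo-replication</li>
</ul>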

<hr />

<h3 id="资源">Resources</h3>

<p><strong>PostgreSQL features:</strong></p>

<ul>
  <li><a href="https://www.postgresql.org/docs/current/sql-notify.html">LISTEN/NOTIFY documentation</a></li>
  <li><a href="https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE">SKIP LOCKED</a></li>
  <li><a href="https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-UNLOGGED">UNLOGGED tables</a></li>
</ul>

<p><strong>Tools:</strong></p>

<ul>
  <li><a href="https://www.pgbouncer.org/">pgBouncer</a> - connection pooling</li>
  <li><a href="https://www.postgresql.org/docs/current/pgstatstatements.html">pg_stat_statements</a> - query performance</li>
</ul>

<p><strong>Alternative solutions:</strong></p>

<ul>
  <li><a href="https://github.com/graphile/worker">Graphile Worker</a> - a Postgres-based job queue</li>
  <li><a href="https://github.com/timgit/pg-boss">pg-boss</a> - another Postgres queue</li>
</ul>

<hr />

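<h3 id="tldr">TL;DR</h3>

<p><strong>I replaced Redis with PostgreSQL for:</strong></p>

<ol>
  <li>Caching → UNLOGGED tables</li>
  <li>Pub/sub → LISTEN/NOTIFY</li>
  <li>Job queues → SKIP LOCKED</li>
  <li>Sessions → JSONB tables</li>
</ol>

<p><strong>Results:</strong></p>

<ul>
  <li>Saved $100/month</li>
  <li>Reduced operational complexity</li>
  <li>Slightly slower (0.1-1ms) but acceptable</li>
  <li>Transactional consistency guaranteed</li>
</ul>

<p><strong>When to do this:</strong></p>

<ul>
  <li>Small to medium applications</li>
  <li>Simple caching needs</li>
  <li>You want fewer moving parts</li>
</ul>

<p><strong>When not to:</strong></p>

<ul>
  <li>High performance requirements (100k+ ops/sec)</li>
  <li>You use Redis-specific features</li>
  <li>You have a dedicated ops team</li>
</ul>

<hr />

<p><strong>Have you replaced Redis with Postgres (or the other way round)?</strong> What was your experience? Share your benchmarks in the comments! 👇</p>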

<p><em>P.S. - 想要后续的”PostgreSQL 隐藏功能”或”何时 Redis 实际上更好”吗？告诉我！</em></p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="postgresql" /><category term="redis" /><category term="数据库" /><summary type="html"><![CDATA[我用PostgreSQL替换Redis并实现更快性能 摘要： 作者分享了将Redis功能迁移到PostgreSQL的经验，成功简化技术栈并降低成本。PostgreSQL通过UNLOGGED表实现高效缓存（0.08ms插入），使用LISTEN/NOTIFY进行发布/订阅（2-5ms延迟），以及利用SKIP LOCKED特性构建任务队列。虽然单个操作略慢于Redis，但整体性能更优，且消除了数据一致性问题。这一改变每月节省约100美元运维成本，并减少了系统复杂度。特别展示了实时日志流的实现方案，通过触发器自动发]]></summary></entry><entry><title type="html">《高性能mysql》读书笔记</title><link href="https://gaoxingliang.github.io/blog/2026/01/20/mysql-157177904/" rel="alternate" type="text/html" title="《高性能mysql》读书笔记" /><published>2026-01-20T07:27:01+00:00</published><updated>2026-01-20T07:27:01+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/01/20/mysql-157177904</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/01/20/mysql-157177904/"><![CDATA[<h4 id="文章目录">文章目录</h4>

<ul>
  <li><a href="#__1">Chapter 3: Monitoring</a></li>
  <li>
    <ul>
      <li><a href="#_2">Monitoring stored procedures</a></li>
    </ul>
  </li>
  <li><a href="#__36">Chapter 7: High-performance indexes</a></li>
  <li>
    <ul>
      <li><a href="#_37">Prefix indexes and cardinality</a></li>
      <li><a href="#explain_51">Output of explain</a></li>
      <li>
        <ul>
          <li><a href="#type_52">type:</a></li>
          <li><a href="#Extra_57">Extra:</a></li>
          <li><a href="#where_63">How where is implemented</a></li>
          <li><a href="#_68">When indexes are not used</a></li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="#10_88">Chapter 10: Backup and recovery</a></li>
  <li>
    <ul>
      <li><a href="#_mydumper____500GB__101">🔒 I. <code class="language-plaintext highlighter-rouge">mydumper</code>: secure logical backups (for databases ≤ 500GB)</a></li>
      <li>
        <ul>
          <li><a href="#__103">✅ Backup strategy</a></li>
          <li><a href="#_1__111">🛠 1. Create a dedicated backup account (run on the primary)</a></li>
          <li><a href="#_2_optscriptsmydumper_backupsh_123">📦 2. Secure backup script (<code class="language-plaintext highlighter-rouge">/opt/scripts/mydumper_backup.sh</code>)</a></li>
          <li><a href="#_3__181">🔁 3. Secure restore example (to a new instance)</a></li>
        </ul>
      </li>
      <li><a href="#_xtrabackup____100GB__204">🔒 II. <code class="language-plaintext highlighter-rouge">xtrabackup</code>: secure physical backups (for databases ≥ 100GB)</a></li>
      <li>
        <ul>
          <li><a href="#__206">✅ Backup strategy</a></li>
          <li><a href="#_1__214">🛠 1. Create a backup account (run on the primary)</a></li>
          <li><a href="#_2_optscriptsxtrabackup_fullsh_223">📦 2. Secure full backup script (<code class="language-plaintext highlighter-rouge">/opt/scripts/xtrabackup_full.sh</code>)</a></li>
          <li><a href="#_3__264">➕ 3. Incremental backup script (daily)</a></li>
          <li><a href="#_4__283">🔁 4. Secure restore procedure (to a new server)</a></li>
          <li>
            <ul>
              <li><a href="#_1_285">Step 1: Transfer and decrypt the full backup</a></li>
              <li><a href="#_2_292">Step 2: Apply incrementals (if any)</a></li>
              <li><a href="#_3_prepare___299">Step 3: Final prepare + start</a></li>
              <li><a href="#_4PITR_307">Step 4: PITR (if needed)</a></li>
            </ul>
          </li>
        </ul>
      </li>
      <li><a href="#__318">🛡 III. Production security hardening checklist</a></li>
      <li><a href="#__332">📊 IV. How to choose?</a></li>
      <li><a href="#__343">✅ Summary</a></li>
    </ul>
  </li>
</ul>

<h2 id="第三章-监控">Chapter 3: Monitoring</h2>

<h3 id="关于存储过程的监控">Monitoring stored procedures</h3>

<p>Example stored procedure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DELIMITER $$

CREATE PROCEDURE SimpleSelectOne()
BEGIN
    SELECT 1 AS result;
END$$

DELIMITER ;
</code></pre></div></div>

<p>This query shows the SQL statements executed inside a stored procedure, but not which stored procedure they belong to:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select * from performance_schema.events_statements_history where EVENT_NAME like 'statement/sp%' \G
</code></pre></div></div>

<p><img src="/assets/images/posts/mysql-157177904/img-001.png" alt="在这里插入图片描述" /></p>

<p>Find the stored procedure calls:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT 
    EVENT_ID AS call_event_id,
    OBJECT_SCHEMA AS proc_schema,
    OBJECT_NAME AS proc_name,
    SQL_TEXT
FROM performance_schema.events_statements_history
WHERE SQL_TEXT LIKE 'CALL%';
</code></pre></div></div>

<p><img src="/assets/images/posts/mysql-157177904/img-002.png" alt="在这里插入图片描述" /><br />
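 Note that <code class="language-plaintext highlighter-rouge">call_event_id</code> here matches the <code class="language-plaintext highlighter-rouge">nesting_event_id</code> of the statements returned by the previous query, so the two result sets can be correlated to see which call ran which statement.</p>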
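
<p>As a minimal sketch of that correlation, the two queries can be combined with a self-join on <code class="language-plaintext highlighter-rouge">NESTING_EVENT_ID</code>. It assumes the nested statements are still present in <code class="language-plaintext highlighter-rouge">events_statements_history</code>, which keeps only a limited number of recent events per thread; the column aliases are illustrative:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sketch: pair each CALL statement with the statements it executed.
-- Event IDs are only unique within a thread, so THREAD_ID is part of the join.
SELECT
    c.EVENT_ID   AS call_event_id,
    c.SQL_TEXT   AS call_text,
    s.EVENT_ID   AS stmt_event_id,
    s.EVENT_NAME AS stmt_type,
    s.SQL_TEXT   AS stmt_text
FROM performance_schema.events_statements_history AS s
JOIN performance_schema.events_statements_history AS c
  ON  c.THREAD_ID = s.THREAD_ID
  AND c.EVENT_ID  = s.NESTING_EVENT_ID
WHERE c.SQL_TEXT LIKE 'CALL%'
  AND s.EVENT_NAME LIKE 'statement/sp/%'
ORDER BY c.EVENT_ID, s.EVENT_ID;
</code></pre></div></div>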

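<h2 id="第七章-高性能索引">Chapter 7: High-Performance Indexing</h2>

<h3 id="关于前缀索引和基数">Prefix indexes and cardinality</h3>

<p>1. Selectivity = cardinality / total number of rows, where cardinality is the number of distinct values in the column. The closer the cardinality is to the row count, the higher the selectivity and the more rows an index lookup can rule out.<br />
 2. A column prefix index is a trade-off between index size and lookup speed: choose a prefix length that is still reasonably selective.</p>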

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select count(distinct left(city, 3)) /  count(*) as sel3,
count(distinct left(city, 4)) /  count(*) as sel4,
count(distinct left(city, 5)) /  count(*) as sel5
from city
</code></pre></div></div>

<p>Compare the selectivity of the different prefix lengths, pick a suitable one, and create the index:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>alter table city add key (city(7))
</code></pre></div></div>
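
<p>One way to sanity-check the chosen length, sketched here on the assumption that the same <code class="language-plaintext highlighter-rouge">city</code> table and column are used, is to compare the prefix's selectivity with that of the full column. A prefix whose value is close to <code class="language-plaintext highlighter-rouge">sel_full</code> gives up little selectivity for a smaller index:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sketch: full-column selectivity as the baseline for the 7-character prefix
SELECT
    COUNT(DISTINCT city) / COUNT(*)          AS sel_full,
    COUNT(DISTINCT LEFT(city, 7)) / COUNT(*) AS sel7
FROM city;
</code></pre></div></div>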

<h3 id="explain的输出">EXPLAIN output</h3>

<h4 id="type">type:</h4>

<ul>
  <li>ref: an index lookup using the index or a leftmost prefix of its columns</li>
  <li>ALL: full table scan</li>
  <li>index: full index scan</li>
</ul>

<h4 id="extra">Extra:</h4>

<ul>
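  <li>Using where: rows are additionally filtered by the server after the storage engine returns them</li>
  <li>Using index: a covering index is used; every column the query needs is in the index</li>
  <li>Using filesort: the result is sorted in an extra pass instead of being read in index order</li>
  <li>Using temporary: a temporary table is created to hold intermediate results</li>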
</ul>

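<h4 id="对于where的实现">How WHERE is applied</h4>

<p>From fastest to slowest:</p>

<ul>
  <li>Apply the WHERE conditions during the index lookup to eliminate non-matching rows [done in the storage engine layer]</li>
  <li>Use a covering index (Extra shows Using index): read the index entries, then filter them in the server [done in the server layer]</li>
  <li>Read rows from the table (Extra shows Using where), then filter them in the server layer. This is the slowest.</li>
</ul>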
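
<p>The three cases can be tried out with a sketch like the one below. The table <code class="language-plaintext highlighter-rouge">t_user</code> and its index are hypothetical, and the plan the optimizer actually picks depends on the data volume and MySQL version, so the comments describe what is typically seen rather than guaranteed output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical table, used only for this illustration
CREATE TABLE t_user (
    id   BIGINT PRIMARY KEY,
    name VARCHAR(50),
    age  INT,
    city VARCHAR(50),
    KEY idx_name_age (name, age)
);

-- 1) The condition is applied during the index lookup (type is typically ref)
EXPLAIN SELECT * FROM t_user WHERE name = 'alice';

-- 2) Covering index scan: only indexed columns are needed, but age is not the
--    leftmost index column, so index entries are read and then filtered
--    (Extra typically includes "Using index")
EXPLAIN SELECT name, age FROM t_user WHERE age = 30;

-- 3) No usable index on city: rows are read from the table and filtered
--    in the server layer (Extra typically includes "Using where")
EXPLAIN SELECT * FROM t_user WHERE city = 'Chengdu';
</code></pre></div></div>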

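<h4 id="索引不被使用的情况">When an index is not used</h4>

<p>1. The condition prevents index use:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Check whether the condition on name wraps the column in a function or triggers an implicit type conversion, e.g. WHERE LOWER(name) = 'xxx', or WHERE name = 123 when name is a string column
Check for operators such as != and NOT IN, which often cannot use an index efficiently
</code></pre></div></div>

<p>2. The index selectivity is too low:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>If the values of name are highly concentrated (e.g. 90% of the rows share the same name), the optimizer may decide that a full table scan is cheaper than an index scan
</code></pre></div></div>

<p>3. The statistics are inaccurate:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The MySQL optimizer relies on statistics to make its decisions; stale statistics can lead it to pick the wrong execution plan
</code></pre></div></div>

<p>4. The types of the indexed column and the value (or join column) it is compared with do not match, e.g. a string column compared with an integer, which forces an implicit conversion</p>

<p>5. The query matches too much of the table:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>If the rows matching name = xxx make up more than roughly 30% of the table (a rule of thumb, not a fixed threshold), the optimizer may choose a full table scan
</code></pre></div></div>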
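
<p>A quick way to check these cases is to run <code class="language-plaintext highlighter-rouge">EXPLAIN</code> on each variant and compare the <code class="language-plaintext highlighter-rouge">key</code> column. The sketch below uses a hypothetical table <code class="language-plaintext highlighter-rouge">t_member</code>; the names are assumptions made for illustration only:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical table: name is an indexed string column
CREATE TABLE t_member (
    id   BIGINT PRIMARY KEY,
    name VARCHAR(50),
    KEY idx_name (name)
);

-- Function applied to the indexed column: idx_name cannot be used for the lookup
EXPLAIN SELECT * FROM t_member WHERE LOWER(name) = 'alice';

-- String column compared with a number: implicit conversion, no index lookup
EXPLAIN SELECT * FROM t_member WHERE name = 123;

-- Plain comparison with a string literal: idx_name can be used
EXPLAIN SELECT * FROM t_member WHERE name = 'alice';

-- Refresh the index statistics if the optimizer's row estimates look stale
ANALYZE TABLE t_member;
</code></pre></div></div>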

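<h2 id="第10章备份与恢复">Chapter 10: Backup and Recovery</h2>

<p>Common, recommended tools:</p>

<ul>
  <li>Physical (file-based) backup and restore: xtrabackup</li>
  <li>Logical backup and restore: mydumper</li>
</ul>

<p>In production, a <strong>secure, reliable, and verifiable backup and restore procedure</strong> is a lifeline for database operations. <code class="language-plaintext highlighter-rouge">mydumper</code> (logical backup) and <code class="language-plaintext highlighter-rouge">xtrabackup</code> (physical backup) are two of the most widely used tools in the MySQL ecosystem, each suited to different scenarios:</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">mydumper</code></strong>: suited to <strong>small and medium databases, cross-version migration, and restoring individual tables</strong></li>
  <li><strong><code class="language-plaintext highlighter-rouge">xtrabackup</code></strong>: suited to <strong>large databases, fast restores, and PITR (point-in-time recovery)</strong></li>
</ul>

<p>The sections below give <strong>production-oriented backup and restore examples</strong> for each tool, covering access control, encryption, verification, and monitoring.</p>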

<h3 id="-一mydumper--安全逻辑备份适用于--500gb-库">🔒 1. <code class="language-plaintext highlighter-rouge">mydumper</code>: secure logical backup (for databases ≤ 500GB)</h3>

<h4 id="-备份策略">✅ Backup strategy</h4>

<ul>
  <li><strong>Daily full backup plus incremental binlogs</strong></li>
  <li><strong>Compression and encryption</strong></li>
  <li><strong>7-day retention</strong></li>
  <li><strong>Dedicated backup account (least privilege)</strong></li>
</ul>

<hr />

<h4 id="-1-创建备份专用账号主库执行">🛠 1. Create a dedicated backup account (run on the primary)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Principle of least privilege
CREATE USER 'backup'@'%' IDENTIFIED BY 'StrongPass!2026';
GRANT SELECT, RELOAD, SHOW DATABASES, LOCK TABLES, PROCESS ON *.* TO 'backup'@'%';
FLUSH PRIVILEGES;
</code></pre></div></div>

<blockquote>
  <p>⚠️ <strong>Do not grant the SUPER privilege!</strong></p>
</blockquote>

<hr />

<h4 id="-2-安全备份脚本optscriptsmydumper_backupsh">📦 2. Secure backup script (<code class="language-plaintext highlighter-rouge">/opt/scripts/mydumper_backup.sh</code>)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# Secure mydumper backup script for production

set -euo pipefail

BACKUP_DIR="/backup/mysql/mydumper"
DATE=$(date +%Y%m%d_%H%M)
LOG_FILE="/var/log/mydumper_backup.log"
MYSQL_HOST="127.0.0.1"
MYSQL_USER="backup"
MYSQL_PASS="StrongPass!2026"
ENCRYPTION_KEY="/etc/mysql/backup.key"  # AES-256 key file

# Create the target directory
mkdir -p "${BACKUP_DIR}/${DATE}"

# Log the start time
echo "[$(date)] Starting mydumper backup..." &gt;&gt; "$LOG_FILE"

# Run the backup (compressed, encrypted, parallel).
# A comment must not follow a trailing backslash, so the options are explained here:
#   --compress              compress the dump files to save space
#   --encrypt*              encrypt the backup files; confirm these flags with
#                           `mydumper --help`, since not every build provides them
#                           (otherwise encrypt the output with an external tool)
#   --threads               tune to the number of CPU cores
#   --trx-consistency-only  transactional consistency only (no long table locks)
mydumper \
  --host="${MYSQL_HOST}" \
  --user="${MYSQL_USER}" \
  --password="${MYSQL_PASS}" \
  --outputdir="${BACKUP_DIR}/${DATE}" \
  --compress=gzip \
  --encrypt=AES256 \
  --encrypt-key-file="${ENCRYPTION_KEY}" \
  --threads=8 \
  --trx-consistency-only \
  --verbose=3 \
  &gt;&gt; "$LOG_FILE" 2&gt;&amp;1

# Verify the backup (check that the metadata file exists)
if [ ! -f "${BACKUP_DIR}/${DATE}/metadata" ]; then
  echo "[$(date)] ERROR: Backup failed - metadata missing!" &gt;&gt; "$LOG_FILE"
  exit 1
fi

# Remove backups older than 7 days (-mindepth 1 protects BACKUP_DIR itself)
find "${BACKUP_DIR}" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;

echo "[$(date)] Backup completed successfully." &gt;&gt; "$LOG_FILE"
</code></pre></div></div>

<blockquote>
  <p>🔑 <strong>Encryption key management</strong>:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Generate a 256-bit AES key (readable by root only)
openssl rand -base64 32 &gt; /etc/mysql/backup.key
chmod 600 /etc/mysql/backup.key
chown root:root /etc/mysql/backup.key
</code></pre></div>  </div>
</blockquote>

<hr />

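<h4 id="-3-安全恢复示例到新实例">🔁 3. Secure restore example (to a new instance)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Decrypt and restore
# (confirm the --decrypt* flags with `myloader --help` for your build)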
myloader \
  --host=127.0.0.1 \
  --user=restore_user \
  --password='RestorePass!2026' \
  --directory=/backup/mysql/mydumper/20260122_1400 \
  --decrypt=AES256 \
  --decrypt-key-file=/etc/mysql/backup.key \
  --threads=8 \
  --overwrite-tables \
  --verbose=3
</code></pre></div></div>

<blockquote>
  <p>✅ <strong>Before restoring, always</strong>:</p>

  <ol>
    <li>Test the restore in an <strong>isolated environment</strong></li>
    <li>Check <code class="language-plaintext highlighter-rouge">SHOW TABLES;</code> and <code class="language-plaintext highlighter-rouge">SELECT COUNT(*)</code> to verify the data volume</li>
    <li><strong>Never restore directly onto the production primary!</strong></li>
  </ol>
</blockquote>
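
<p>The verification step can be sketched as follows. The schema and table names (<code class="language-plaintext highlighter-rouge">your_db</code>, <code class="language-plaintext highlighter-rouge">your_table</code>) are placeholders, and <code class="language-plaintext highlighter-rouge">TABLE_ROWS</code> is only an estimate for InnoDB, so exact counts are worth running on the tables that matter most:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Run on both the source and the restored instance, then compare the results.
-- your_db and your_table are placeholders.

-- Quick overview: every table should be present (row counts are estimates)
SELECT TABLE_NAME, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_db'
ORDER BY TABLE_NAME;

-- Exact row count for an important table
SELECT COUNT(*) FROM your_db.your_table;

-- Content checksum; values are only comparable between instances with the
-- same MySQL version and row format
CHECKSUM TABLE your_db.your_table;
</code></pre></div></div>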

<hr />

<h3 id="-二xtrabackup--安全物理备份适用于--100gb-库">🔒 2. <code class="language-plaintext highlighter-rouge">xtrabackup</code>: secure physical backup (for databases ≥ 100GB)</h3>

<h4 id="-备份策略-1">✅ Backup strategy</h4>

<ul>
  <li><strong>Full backup every Sunday plus daily incrementals</strong></li>
  <li><strong>Streaming compression and encryption</strong></li>
  <li><strong>4-week retention</strong></li>
  <li><strong>PITR support (binlog-based)</strong></li>
</ul>

<hr />

<h4 id="-1-创建备份账号主库执行">🛠 1. Create a backup account (run on the primary)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE USER 'xtrabackup'@'localhost' IDENTIFIED BY 'XbkPass!2026';
GRANT RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT, SHOW DATABASES ON *.* TO 'xtrabackup'@'localhost';
FLUSH PRIVILEGES;
</code></pre></div></div>

<hr />

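<h4 id="-2-安全全量备份脚本optscriptsxtrabackup_fullsh">📦 2. Secure full backup script (<code class="language-plaintext highlighter-rouge">/opt/scripts/xtrabackup_full.sh</code>)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# xtrabackup full backup, hardened for production

set -euo pipefail

BACKUP_BASE="/backup/mysql/xtrabackup"
DATE=$(date +%Y%m%d)
FULL_BACKUP_DIR="${BACKUP_BASE}/full_${DATE}"
LOG_FILE="/var/log/xtrabackup_full.log"
ENCRYPTION_KEY="/etc/mysql/xbk.key"

mkdir -p "$FULL_BACKUP_DIR"

# Stream the backup as xbstream, compressed and encrypted, to the backup server.
# zstd is generally faster than gzip. A comment must not follow a trailing backslash.
# --extra-lsndir keeps a local copy of xtrabackup_checkpoints, so that later
# incremental backups can use this directory as --incremental-basedir.
xtrabackup \
  --user=xtrabackup \
  --password='XbkPass!2026' \
  --backup \
  --target-dir="$FULL_BACKUP_DIR" \
  --extra-lsndir="$FULL_BACKUP_DIR" \
  --stream=xbstream \
  --compress=zstd \
  --compress-threads=4 \
  --encrypt=AES256 \
  --encrypt-key-file="$ENCRYPTION_KEY" \
  --encrypt-threads=4 \
  | ssh backup-server "mkdir -p ${FULL_BACKUP_DIR} &amp;&amp; cat &gt; ${FULL_BACKUP_DIR}/full.xbstream.zst.enc"

# Verify on the backup server in a scratch copy: extract the stream, decrypt,
# decompress, and test-prepare. Assumes the key file exists at the same path on
# backup-server. The binlog position for PITR is recorded in xtrabackup_binlog_info
# inside the extracted backup.
ssh backup-server "mkdir -p ${FULL_BACKUP_DIR}/verify &amp;&amp; xbstream -x -C ${FULL_BACKUP_DIR}/verify &lt; ${FULL_BACKUP_DIR}/full.xbstream.zst.enc &amp;&amp; xtrabackup --decrypt=AES256 --encrypt-key-file=$ENCRYPTION_KEY --target-dir=${FULL_BACKUP_DIR}/verify &amp;&amp; xtrabackup --decompress --target-dir=${FULL_BACKUP_DIR}/verify &amp;&amp; xtrabackup --prepare --target-dir=${FULL_BACKUP_DIR}/verify"

echo "Full backup completed: $DATE" &gt;&gt; "$LOG_FILE"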
</code></pre></div></div>

<blockquote>
  <p>💡 <strong>Why use <code class="language-plaintext highlighter-rouge">--stream</code>?</strong><br />
 It avoids filling the local disk by streaming the backup straight to the backup server.</p>
</blockquote>

<hr />

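<h4 id="-3-增量备份脚本每日">➕ 3. Incremental backup script (daily)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Incremental backup based on last Sunday's full backup
# (the basedir must contain that backup's xtrabackup_checkpoints file)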
xtrabackup \
  --user=xtrabackup \
  --password=XbkPass!2026 \
  --backup \
  --target-dir=/tmp/inc_$(date +%Y%m%d) \
  --incremental-basedir=/backup/mysql/xtrabackup/full_20260119 \
  --stream=xbstream \
  --compress=zstd \
  --encrypt=AES256 \
  --encrypt-key-file=/etc/mysql/xbk.key \
  | ssh backup-server "cat &gt; /backup/mysql/xtrabackup/inc_$(date +%Y%m%d).xbstream.zst.enc"
</code></pre></div></div>

<hr />

<h4 id="-4-安全恢复流程到新服务器">🔁 4. Secure restore procedure (to a new server)</h4>

<h5 id="步骤-1传输并解密全量">Step 1: Transfer and decrypt the full backup</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scp backup-server:/backup/mysql/xtrabackup/full_20260119/full.xbstream.zst.enc /restore/
mkdir -p /restore/full
xbstream -x -C /restore/full &lt; /restore/full.xbstream.zst.enc
xtrabackup --decrypt=AES256 --encrypt-key-file=/etc/mysql/xbk.key --target-dir=/restore/full
xtrabackup --decompress --target-dir=/restore/full
</code></pre></div></div>

<h5 id="步骤-2应用增量如有">Step 2: Apply the incrementals (if any)</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xtrabackup --decrypt=... --decompress=... --target-dir=/restore/inc_20260120
xtrabackup --prepare --apply-log-only --target-dir=/restore/full
xtrabackup --prepare --target-dir=/restore/full --incremental-dir=/restore/inc_20260120
</code></pre></div></div>

<h5 id="步骤-3最终-prepare--启动">Step 3: Final prepare and start</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xtrabackup --prepare --target-dir=/restore/full
rsync -avrP /restore/full/ /var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql
systemctl start mysqld
</code></pre></div></div>

<h5 id="步骤-4pitr如果需要">Step 4: PITR (if needed)</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Look up the binlog position recorded by the backup
cat /restore/full/xtrabackup_binlog_info

# Replay the binlog with mysqlbinlog up to the chosen point in time
mysqlbinlog --start-position=12345 --stop-datetime="2026-01-22 14:00:00" binlog.000001 | mysql -u root -p
</code></pre></div></div>

<hr />

<h3 id="-三生产环境安全加固清单">🛡 3. Production security hardening checklist</h3>

<table>
  <thead>
    <tr>
      <th>Item</th>
      <th>mydumper</th>
      <th>xtrabackup</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Least-privilege account</strong></td>
      <td>✅</td>
      <td>✅</td>
    </tr>
    <tr>
      <td><strong>Encryption in transit</strong></td>
      <td>SSH / TLS</td>
      <td>SSH / TLS</td>
    </tr>
    <tr>
      <td><strong>Encryption at rest</strong></td>
      <td>AES256</td>
      <td>AES256</td>
    </tr>
    <tr>
      <td><strong>Integrity check</strong></td>
      <td><code class="language-plaintext highlighter-rouge">metadata</code> file</td>
      <td><code class="language-plaintext highlighter-rouge">xtrabackup_checkpoints</code></td>
    </tr>
    <tr>
      <td><strong>Restore drill</strong></td>
      <td>Monthly</td>
      <td>Quarterly</td>
    </tr>
    <tr>
      <td><strong>Monitoring and alerting</strong></td>
      <td>Sudden drop in backup size, failures in the log</td>
      <td>Same as mydumper, plus prepare failures</td>
    </tr>
    <tr>
      <td><strong>Retention</strong></td>
      <td>7 days of full backups</td>
      <td>4 weeks (full + incremental)</td>
    </tr>
  </tbody>
</table>

<hr />

<h3 id="-四如何选择">📊 4. How to choose?</h3>

<table>
  <thead>
    <tr>
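      <th>Scenario</th>
      <th>Recommended tool</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Large database (500GB), RTO &lt; 30min required</td>
      <td>✅ <strong>xtrabackup</strong></td>
    </tr>
    <tr>
      <td>Need to restore a single table</td>
      <td>✅ mydumper (the <code class="language-plaintext highlighter-rouge">.sql</code> files can be edited)</td>
    </tr>
    <tr>
      <td>Need PITR (point-in-time recovery)</td>
      <td>✅ xtrabackup + binlog</td>
    </tr>
    <tr>
      <td>Cloud environment (RDS)</td>
      <td>❌ No filesystem access, so xtrabackup cannot be used → rely on the cloud provider's snapshots</td>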
    </tr>
  </tbody>
</table>

<hr />

<h3 id="-总结">✅ Summary</h3>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">mydumper</code></strong>: logical backup, <strong>flexible but slow</strong>, suited to small and medium databases</li>
  <li><strong><code class="language-plaintext highlighter-rouge">xtrabackup</code></strong>: physical backup, <strong>fast restores</strong>, suited to large databases</li>
  <li><strong>Shared principles</strong>:<br />
 🔐 Encrypt (in transit and at rest)<br />
 👮 Least privilege<br />
 ✅ Regular restore drills<br />
 📉 Monitor for anomalies in backup size and duration</li>
</ul>

<blockquote>
  <p>💡 <strong>Final recommendation</strong>:<br />
 <strong>Use both</strong>:</p>

  <ul>
    <li><code class="language-plaintext highlighter-rouge">xtrabackup</code> as the primary tool (fast restores)</li>
    <li><code class="language-plaintext highlighter-rouge">mydumper</code> as a complement (single-table restores, cross-environment migration)</li>
  </ul>
</blockquote>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="mysql" /><category term="数据库" /><summary type="html"><![CDATA[摘要：本文介绍了MySQL存储过程的监控方法。通过示例展示了创建简单存储过程SimpleSelectOne，演示了如何查询performance_schema.events_statements_history表来监控存储过程中的SQL执行情况。重点说明了仅能看到执行的SQL而无法直接识别所属存储过程的问题，并提供了通过关联call_event_id和nesting_event_id字段来追踪存储过程调用的解决方案。文中包含SQL查询示例和截图说明，帮助理解存储过程监控的实际操作流程。]]></summary></entry><entry><title type="html">mysql federatedengine 使用</title><link href="https://gaoxingliang.github.io/blog/2026/01/19/mysql-federatedengine-157138288/" rel="alternate" type="text/html" title="mysql federatedengine 使用" /><published>2026-01-19T08:53:27+00:00</published><updated>2026-01-19T08:53:27+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/01/19/mysql-federatedengine-157138288</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/01/19/mysql-federatedengine-157138288/"><![CDATA[<h2 id="abstract">abstract</h2>

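<p>In <a href="https://blog.csdn.net/scugxl/article/details/149571144?spm=1001.2014.3001.5501">Pitfalls encountered while evaluating a large risk-control system</a>, I mentioned using MySQL's FEDERATED engine to import data into the source-aligned (ODS) layer. Recently mysqld has been killed by the OOM killer from time to time; on a machine with 16GB of RAM it was using about 15GB:</p>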

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo dmesg -T | grep -i "killed process"
</code></pre></div></div>

<p><img src="/assets/images/posts/mysql-federatedengine-157138288/img-001.png" alt="在这里插入图片描述" /></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- query time
SHOW STATUS;

-- MySQL uptime and memory overview
SELECT
  -- Uptime
  CONCAT(
    FLOOR(u.uptime_s / 86400), 'd ',
    FLOOR((u.uptime_s % 86400) / 3600), 'h ',
    FLOOR((u.uptime_s % 3600) / 60), 'm'
  ) AS uptime,

  -- Memory configuration
  FORMAT(@@innodb_buffer_pool_size / 1024 / 1024, 2) AS ibp_mb,
  FORMAT(@@key_buffer_size / 1024 / 1024, 2) AS key_buffer_mb,

  -- Connections
  @@max_connections AS max_conn,
  c.current_conn,

  -- Estimated peak memory (MB)
  FORMAT((
    @@innodb_buffer_pool_size +
    @@key_buffer_size +
    (@@sort_buffer_size + @@read_buffer_size + @@join_buffer_size) * @@max_connections
  ) / 1024 / 1024, 2) AS est_peak_memory_mb

-- performance_schema.global_status replaces information_schema.GLOBAL_STATUS
-- (removed in MySQL 8.0); each status value is read in its own subquery
FROM
  (SELECT CAST(VARIABLE_VALUE AS UNSIGNED) AS uptime_s
     FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Uptime') AS u,
  (SELECT CAST(VARIABLE_VALUE AS UNSIGNED) AS current_conn
     FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Threads_connected') AS c;
</code></pre></div></div>
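
<p>The peak figure above is only an estimate derived from the configuration. As a rough cross-check, and assuming the <code class="language-plaintext highlighter-rouge">sys</code> schema is installed and memory instrumentation is enabled in <code class="language-plaintext highlighter-rouge">performance_schema</code> (the default in MySQL 8.0, only partially enabled in 5.7), the memory the server actually tracks can be inspected like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sketch: memory currently allocated according to performance_schema
SELECT * FROM sys.memory_global_total;

-- Largest consumers by instrument; the view is already sorted by current bytes
SELECT event_name, current_alloc
FROM sys.memory_global_by_current_bytes
LIMIT 10;
</code></pre></div></div>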

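<h2 id="调查过程">Investigation</h2>

<h3 id="查看日志">Checking the logs</h3>

<p>The logs showed that the failures happened while a stored procedure was running. The stored procedure generates SQL like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INSERT INTO t_dp_i_import_collection_plan(xxx)   -- column list omitted
SELECT xxx   -- column list omitted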
FROM import_collection_plan as t
where report_date = '2025-01-31'
</code></pre></div></div>

<p>Check the table sizes (estimated):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT
    TABLE_NAME AS `Table`,
    ENGINE AS `Engine`,
    ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS `Size_MB`,
    ROUND(DATA_LENGTH / 1024 / 1024, 2) AS `Data_MB`,
    ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS `Index_MB`,
    TABLE_ROWS AS `Est_Row_Count`
FROM
    information_schema.TABLES
WHERE
    TABLE_SCHEMA = 'xxx'  -- 👈 replace with your database name
ORDER BY TABLE_ROWS DESC
</code></pre></div></div>

<p>This showed that <code class="language-plaintext highlighter-rouge">import_collection_plan</code> has roughly 8 million rows and is defined with the FEDERATED engine, so I suspected it was the cause.<br />
 <img src="/assets/images/posts/mysql-federatedengine-157138288/img-002.jpeg" alt="在这里插入图片描述" /></p>

<h3 id="mysql的联邦表">MySQL FEDERATED tables</h3>

<p><a href="https://dev.mysql.com/doc/refman/8.0/en/federated-storage-engine.html">MySQL official documentation</a><br />
 <img src="/assets/images/posts/mysql-federatedengine-157138288/img-003.png" alt="在这里插入图片描述" /><br />
 One passage is key:</p>

<blockquote>
  <p>A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.<br />
 Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging.</p>
</blockquote>

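<p>According to this passage, when a query cannot use an index on the FEDERATED table, neither the index lookup nor the WHERE condition is pushed down to the remote server. That is very dangerous for large tables, because every row is transferred and a large amount of local memory is used to sort and filter them.</p>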

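<h3 id="我的测试">My test</h3>

<p>I tested this myself on a LAN with a Mac and a Windows machine. The Mac holds the original data (5 million rows); on the MySQL server running on Windows I created the FEDERATED table:</p>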

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE `import_collection_plan` (
                                          `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT '主键ID',
                                          `process_time` char(19) COLLATE utf8_bin DEFAULT NULL COMMENT '程序执行时间',
                                          `serial_no` varchar(32) COLLATE utf8_bin DEFAULT NULL COMMENT '序号',
                                          `enterprise_name` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '填表企业',
                                          `project_no` varchar(100) COLLATE utf8_bin DEFAULT NULL COMMENT '项目编号',
                                          `receipt_no` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '借据编号',
                                          `repayment_period_no` varchar(32) COLLATE utf8_bin DEFAULT NULL COMMENT '还款期次',
                                          `plan_repayment_date` char(10) COLLATE utf8_bin DEFAULT NULL COMMENT '计划还款日期',
                                          `plan_repayment_principal_amt` decimal(18,2) DEFAULT NULL COMMENT '计划还款本金',
                                          `plan_repayment_interest_amt` decimal(18,2) DEFAULT NULL COMMENT '计划还款利息',
                                          `plan_repayment_other_amt` decimal(18,2) DEFAULT NULL COMMENT '计划还款其他金额',
                                          `repayment_status_name` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '还款状态',
                                          `last_recover_date` char(10) COLLATE utf8_bin DEFAULT NULL COMMENT '最后回收时间',
                                          `last_recover_name` varchar(50) COLLATE utf8_bin DEFAULT NULL COMMENT '最后回收人',
                                          `report_date` char(10) COLLATE utf8_bin DEFAULT NULL COMMENT '报送时间',
                                          PRIMARY KEY (`id`),
                                          KEY `enterprise_name` (`enterprise_name`,`report_date`,`process_time`)
) ENGINE=FEDERATED DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='导入收款计划表' CONNECTION='mysql://root:root@192.168.110.164:3106/xxx/import_collection_plan';
</code></pre></div></div>

<p>Then the test SQL:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>create table d select * from import_collection_plan
where report_date = '2023-02-21';
</code></pre></div></div>

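<p>On the Mac, the table has an index on report_date, and running the statement above directly on the Mac takes about 0.6 seconds.<br />
 On Windows it ran for 10 minutes, MySQL's memory grew from 150MB to 1.6GB, and the network link was saturated:<br />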
 <img src="/assets/images/posts/mysql-federatedengine-157138288/img-004.png" alt="在这里插入图片描述" /></p>

<h3 id="查询验证">Verifying with the query log</h3>

<p>Enable the general log on the source server:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SET GLOBAL general_log = ON;

-- Write to a file (the default)
SET GLOBAL log_output = 'FILE';

-- Or write to the mysql.general_log table (convenient for querying with SQL)
SET GLOBAL log_output = 'TABLE';
</code></pre></div></div>
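
<p>With <code class="language-plaintext highlighter-rouge">log_output = 'TABLE'</code>, the statements that actually arrive from the FEDERATED side can then be read back with a query along these lines. This is a sketch: the <code class="language-plaintext highlighter-rouge">LIKE</code> filter and the row limit are arbitrary choices, and the general log should be switched off again afterwards because it records every statement:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Run on the source server: show the most recent statements touching the table
SELECT event_time,
       user_host,
       CONVERT(argument USING utf8mb4) AS statement_text
FROM mysql.general_log
WHERE command_type = 'Query'
  AND argument LIKE '%import_collection_plan%'
ORDER BY event_time DESC
LIMIT 20;

-- Turn the general log off again once the check is done
SET GLOBAL general_log = OFF;
</code></pre></div></div>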

<p>Run the query on the local server:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>INSERT INTO d select * from import_collection_plan
where report_date = '2023-02-18';
</code></pre></div></div>

<p>The general log on the source server shows that the WHERE condition was not pushed down:<br />
 <img src="/assets/images/posts/mysql-federatedengine-157138288/img-005.png" alt="在这里插入图片描述" /></p>

<p><img src="/assets/images/posts/mysql-federatedengine-157138288/img-006.png" alt="在这里插入图片描述" /></p>

<h2 id="怎么解决">How to fix it</h2>

<h3 id="方案1-源端服务器创建视图">Option 1: Create a pre-filtered table on the source server</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Run on 192.168.110.164
CREATE TABLE import_collection_plan_20230221 AS
SELECT * FROM import_collection_plan 
WHERE report_date = '2023-02-21';
</code></pre></div></div>

<p>If the date changes (as it does in our scenario), the table has to be recreated periodically, for example from a cron job.</p>
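
<p>Instead of an external cron job, the refresh could also be scheduled inside MySQL. The sketch below is one possible shape, not the setup used here: the snapshot table and event names are hypothetical, it assumes the event scheduler is enabled on the source server, and it assumes that the wanted <code class="language-plaintext highlighter-rouge">report_date</code> is always yesterday:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Run on the source server. Requires: SET GLOBAL event_scheduler = ON;
-- import_collection_plan_snapshot and the event name are hypothetical.
CREATE TABLE IF NOT EXISTS import_collection_plan_snapshot LIKE import_collection_plan;

DELIMITER $$

CREATE EVENT refresh_import_collection_plan_snapshot
ON SCHEDULE EVERY 1 DAY
DO
BEGIN
    -- Replace the snapshot contents with yesterday's rows
    DELETE FROM import_collection_plan_snapshot;
    INSERT INTO import_collection_plan_snapshot
    SELECT * FROM import_collection_plan
    WHERE report_date = DATE_FORMAT(CURDATE() - INTERVAL 1 DAY, '%Y-%m-%d');
END$$

DELIMITER ;
</code></pre></div></div>

<p>The local FEDERATED table would then point at the fixed-name snapshot table, so even a full fetch only transfers the rows for one date.</p>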

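<h3 id="方案2-应用层同步">Option 2: Synchronize in the application layer</h3>

<p>Query the data in the application and synchronize it there, without relying on the FEDERATED engine.</p>

<h3 id="方案3-cdc捕获关注表-同步到本地后查询">Option 3: Capture the relevant tables with CDC, sync them locally, then query</h3>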

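<h2 id="更多的测试和调查">Further tests and investigation</h2>

<p>I also asked an AI how this should really be solved. Its answer was:<br />
 <img src="/assets/images/posts/mysql-federatedengine-157138288/img-007.png" alt="在这里插入图片描述" />It describes <code class="language-plaintext highlighter-rouge">where clause pushdown</code> as <code class="language-plaintext highlighter-rouge">limited</code>, which seemed odd, so I ran more tests and read the <a href="https://github.com/mysql/mysql-server/blob/trunk/storage/federated/ha_federated.cc">MySQL source code</a>:<br />
 Case 1: the source table has an index, the local table does not. Pushdown is not possible, and a full-table query is generated: <code class="language-plaintext highlighter-rouge">select col1, col2 from tableA</code>.<br />
 Case 2: the source table has no index, the local table does. The condition is pushed down and the generated SQL is <code class="language-plaintext highlighter-rouge">select col1, col2 from tableA where indexCol = 'x'</code>, but the source server does a full table scan because it has no index.<br />
 Case 3: both the source table and the local table have the index. Equality conditions <strong>can</strong> be pushed down, and so can range conditions such as &lt; and &gt;.</p>

<p>So the real fix is to define the index on both the source table and the local table.<br />
 Note: an index cannot be added to a FEDERATED table in place; the table has to be dropped and recreated with the index.</p>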
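
<p>Using the placeholder names from the cases above (<code class="language-plaintext highlighter-rouge">tableA</code>, <code class="language-plaintext highlighter-rouge">indexCol</code>), the fix might look like the sketch below. The column types, index name, and connection string are assumptions made for illustration; the local column list must match the real remote table:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- On the source server: make sure the filter column is indexed
ALTER TABLE tableA ADD KEY idx_indexCol (indexCol);

-- On the local server: recreate the FEDERATED table with the same index.
-- Dropping a FEDERATED table removes only the local definition, not the remote data.
DROP TABLE IF EXISTS tableA;

CREATE TABLE tableA (
    id       BIGINT NOT NULL,
    indexCol CHAR(10) DEFAULT NULL,
    col1     VARCHAR(50) DEFAULT NULL,
    col2     VARCHAR(50) DEFAULT NULL,
    PRIMARY KEY (id),
    KEY idx_indexCol (indexCol)
) ENGINE=FEDERATED
  CONNECTION='mysql://user:password@source-host:3306/source_db/tableA';
</code></pre></div></div>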

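<h2 id="总结">Summary</h2>

<p>This post ran a number of tests and investigations on the MySQL FEDERATED engine. The conclusions:<br />
 1. The SQL sent to the source server is determined by the local table definition.<br />
 2. How that SQL executes on the source server is determined by the source server's own optimizer and executor.<br />
 3. Give both tables the same indexes so that WHERE conditions and index lookups are pushed down. If the source table has no index, the source server does a full table scan. If the local table has no index, the query sent to the source has no WHERE clause and filtering happens locally, which sends the whole table over the network and consumes a lot of local memory.<br />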
 4. Note that when the source table has no index but the local table does, the read may time out, because the remote server is still doing its full table scan and has no data ready to send. (When neither side has an index, however, there is no timeout, because rows keep flowing. <strong>This case is rather interesting.</strong>)</p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="mysql" /><summary type="html"><![CDATA[MySQL联邦表(FEDERATED引擎)存在严重性能问题：当查询无法使用索引时，会从远程表获取所有数据在本地进行过滤，导致内存暴增和网络过载。测试显示，500万行数据在源端执行仅需0.6秒，而通过联邦表查询耗时10分钟且内存从150MB飙升至1.6GB。问题根源在于联邦表不会下推WHERE条件到远程服务器，而是将所有数据拉取到本地处理。这解释了生产环境中MySQL因OOM被kill的现象。建议避免对大表使用联邦表，或确保查询能利用远程表索引。]]></summary></entry><entry><title type="html">虎嗅24小时屏蔽机器人评论的油猴脚本</title><link href="https://gaoxingliang.github.io/blog/2026/01/15/24-156980598/" rel="alternate" type="text/html" title="虎嗅24小时屏蔽机器人评论的油猴脚本" /><published>2026-01-15T02:19:16+00:00</published><updated>2026-01-15T02:19:16+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/01/15/24-156980598</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/01/15/24-156980598/"><![CDATA[<p>For personal use:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// ==UserScript==
// @name         虎嗅评论过滤 - 屏蔽评论数超过100页的用户
// @namespace    http://tampermonkey.net/
// @version      1.0.1
// @description  自动检测并屏蔽评论数超过100页的用户评论
// @author       You
// @match        https://www.huxiu.com/moment/*
// @exclude      https://www.huxiu.com/member/*
// @grant        GM_xmlhttpRequest
// @connect      api-web-account.huxiu.com
// ==/UserScript==

(function() {
    'use strict';

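    // Configuration
    const MAX_PAGES = 100; // maximum allowed number of comment pages
    const API_URL = 'https://api-web-account.huxiu.com/web/comment/commentList';
    const CHECK_INTERVAL = 2000; // interval for checking for new comments (ms)

    // User IDs currently being checked, to avoid duplicate requests
    const pendingChecks = new Set();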

    /**
     * Extract the user ID from a comment element
     */
    function extractUserId(commentElement) {
        // Try several ways of extracting the user ID
        // Method 1: from the Huxiu member link (most common; format: /member/2374684.html)
        const userLinks = commentElement.querySelectorAll('a[href*="/member/"]');
        for (const link of userLinks) {
            const href = link.getAttribute('href');
            // Match /member/123456.html or /member/123456
            let match = href.match(/\/member\/(\d+)(?:\.html)?/);
            if (match) return match[1];
        }

        // Method 2: from other user link formats
        const otherLinks = commentElement.querySelectorAll('a[href*="/user/"], a[href*="uid="]');
        for (const link of otherLinks) {
            const href = link.getAttribute('href');
            // Match /user/123456
            let match = href.match(/\/user\/(\d+)/);
            if (match) return match[1];
            // Match ?uid=123456
            match = href.match(/[?&amp;]uid=(\d+)/);
            if (match) return match[1];
        }

        // Method 3: from data attributes
        let element = commentElement;
        for (let i = 0; i &lt; 10 &amp;&amp; element; i++) {
            const dataUid = element.getAttribute('data-uid') || 
                           element.getAttribute('data-user-id') ||
                           element.getAttribute('uid');
            if (dataUid &amp;&amp; /^\d+$/.test(dataUid)) {
                return dataUid;
            }
            element = element.parentElement;
        }

        // Method 4: from class or id
        element = commentElement;
        for (let i = 0; i &lt; 5 &amp;&amp; element; i++) {
            const uidMatch = element.className?.match(/uid[_-]?(\d+)|user[_-]?(\d+)/i) ||
                            element.id?.match(/uid[_-]?(\d+)|user[_-]?(\d+)/i);
            if (uidMatch) {
                return uidMatch[1] || uidMatch[2];
            }
            element = element.parentElement;
        }

        // Method 5: from the image src or other attributes
        const img = commentElement.querySelector('img[src*="user"], img[src*="avatar"]');
        if (img) {
            const src = img.getAttribute('src');
            const match = src?.match(/user[\/_-]?(\d+)/i);
            if (match) return match[1];
        }

        // Debug: log information about the element
        console.warn('无法提取用户ID，元素信息:', {
            className: commentElement.className,
            id: commentElement.id,
            innerHTML: commentElement.innerHTML.substring(0, 200)
        });

        return null;
    }

    /**
     * Get the total number of comment pages for a user
     */
    function getUserCommentPages(uid) {
        return new Promise((resolve, reject) =&gt; {
            // If a check for this user is already in flight, wait and retry
            if (pendingChecks.has(uid)) {
                setTimeout(() =&gt; {
                    getUserCommentPages(uid).then(resolve).catch(reject);
                }, 500);
                return;
            }

            pendingChecks.add(uid);

            GM_xmlhttpRequest({
                method: 'POST',
                url: API_URL,
                headers: {
                    'Content-Type': 'application/x-www-form-urlencoded',
                    'Accept': 'application/json',
                    'Referer': 'https://www.huxiu.com/',
                    'Origin': 'https://www.huxiu.com'
                },
                data: `platform=www&amp;page=1&amp;uid=${uid}`,
                onload: function(response) {
                    pendingChecks.delete(uid);
                    try {
                        const data = JSON.parse(response.responseText);
                        
                        // Debug: log the structure of the API response
                        console.log(`[API] 用户 ${uid} 的API响应:`, JSON.stringify(data, null, 2));
                        
                        // Try the possible response formats
                        let totalPages = 0;
                        
                        if (data &amp;&amp; data.data) {
                            // Format 1: standard Huxiu API format { data: { total_page: xxx } }
                            if (data.data.total_page !== undefined &amp;&amp; data.data.total_page !== null) {
                                totalPages = parseInt(data.data.total_page, 10);
                                console.log(`[API] 从 data.data.total_page 获取页数: ${totalPages}`);
                            } else {
                                console.warn(`[API] 用户 ${uid} 的响应中未找到 total_page 字段，data.data 内容:`, data.data);
                            }
                        } else {
                            console.warn(`[API] 用户 ${uid} 的响应格式异常，data 或 data.data 不存在:`, data);
                        }
                        
                        if (totalPages === 0) {
                            console.warn(`[API] 用户 ${uid} 的页数解析为0，可能解析失败`);
                        }

                        resolve(totalPages);
                    } catch (e) {
                        console.error('解析API响应失败:', e, response.responseText);
                        reject(e);
                    }
                },
                onerror: function(error) {
                    pendingChecks.delete(uid);
                    console.error('API请求失败:', error);
                    reject(error);
                }
            });
        });
    }

    /**
     * 隐藏评论元素
     * @param {HTMLElement} commentElement - 评论元素（单个评论项）
     * @param {string} uid - 用户ID
     * @param {number} totalPages - 总评论页数
     */
    function hideComment(commentElement, uid, totalPages) {
        // 确保只隐藏单个评论项，而不是整个列表
        // 检查是否是评论列表容器
        if (commentElement.classList.contains('moment-comment__list')) {
            console.warn(`警告：尝试隐藏评论列表容器，跳过。用户ID: ${uid}`);
            return;
        }
        
        // 只隐藏单个评论项
        commentElement.style.display = 'none';
        commentElement.setAttribute('data-filtered', 'true');
        
        // 在控制台输出屏蔽信息
        console.log(`🚫 已屏蔽用户评论 | 用户ID: ${uid} | 总评论页数: ${totalPages}页`);
        
        // 添加一个简单的提示标记，显示评论已被隐藏
        const marker = document.createElement('div');
        marker.style.cssText = 'padding: 8px 12px; background: #f5f5f5; color: #999; font-size: 12px; margin-bottom: 10px; border-left: 3px solid #ddd; border-radius: 2px;';
        marker.textContent = '该评论已隐藏';
        marker.setAttribute('data-filter-marker', 'true');
        
        // 插入到评论项的父容器中，替换被隐藏的评论项位置
        if (commentElement.parentNode) {
            commentElement.parentNode.insertBefore(marker, commentElement);
        }
    }

    /**
     * 检查并过滤单个评论
     */
    async function checkAndFilterComment(commentElement) {
        // 如果已经处理过，跳过
        if (commentElement.getAttribute('data-checked') === 'true' ||
            commentElement.getAttribute('data-filtered') === 'true') {
            return;
        }

        // 安全检查：确保是单个评论项，而不是评论列表容器
        if (commentElement.classList.contains('moment-comment__list')) {
            console.warn('跳过评论列表容器，只处理单个评论项');
            return;
        }

        const uid = extractUserId(commentElement);
        if (!uid) {
            console.warn('无法提取用户ID:', commentElement);
            return;
        }

        // 标记为已检查
        commentElement.setAttribute('data-checked', 'true');

        try {
            const totalPages = await getUserCommentPages(uid);
            console.log(`用户 ${uid} 的评论页数: ${totalPages}`);

            if (totalPages &gt; MAX_PAGES) {
                hideComment(commentElement, uid, totalPages);
            }
        } catch (error) {
            console.error(`检查用户 ${uid} 失败:`, error);
            // 出错时不隐藏，避免误杀
        }
    }

    /**
     * 查找页面上的所有评论元素
     */
    function findAllComments() {
        // 根据虎嗅网站的实际结构，只选择单个评论项
        // 优先使用最精确的选择器，避免选择到评论列表容器
        const selectors = [
            '.comment-item', // 虎嗅单个评论项的标准选择器
            '[data-comment-id]', // 通过data-comment-id属性的单个评论项
        ];

        const comments = new Set();
        
        for (const selector of selectors) {
            try {
                const elements = document.querySelectorAll(selector);
                elements.forEach(el =&gt; {
                    // 确保不是已经过滤的元素，且有实际内容
                    // 排除评论列表容器（.moment-comment__list）
                    if (el.getAttribute('data-filtered') !== 'true' &amp;&amp;
                        !el.classList.contains('moment-comment__list') &amp;&amp; // 排除列表容器
                        el.offsetHeight &gt; 0 &amp;&amp; // 确保元素可见
                        el.textContent.trim().length &gt; 0) { // 确保有内容
                        comments.add(el);
                    }
                });
            } catch (e) {
                // 忽略无效选择器
            }
        }

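        // De-duplicate: when element A contains element B, prefer the innermost element (the single comment item)
        const candidates = Array.from(comments);
        const filtered = candidates.filter(comment =&gt; {
            // An element that contains other comment elements is a container and is normally excluded
            const hasChildComment = candidates.some(other =&gt; 
                other !== comment &amp;&amp; comment.contains(other)
            );
            // An element contained by another comment element is kept (it is a single comment item)
            const isChildOfComment = candidates.some(other =&gt; 
                other !== comment &amp;&amp; other.contains(comment)
            );
            // Keep items nested inside another element, and standalone items that contain no other comment
            return !hasChildComment || isChildOfComment;
        });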

        return filtered;
    }

    /**
     * 批量检查评论
     */
    async function checkAllComments() {
        const comments = findAllComments();
        console.log(`找到 ${comments.length} 条评论，开始检查...`);

        // 批量处理，避免同时发起太多请求
        const batchSize = 5;
        for (let i = 0; i &lt; comments.length; i += batchSize) {
            const batch = comments.slice(i, i + batchSize);
            await Promise.all(batch.map(comment =&gt; checkAndFilterComment(comment)));
            
            // 批次之间稍作延迟
            if (i + batchSize &lt; comments.length) {
                await new Promise(resolve =&gt; setTimeout(resolve, 500));
            }
        }
    }

    /**
     * 监听DOM变化，处理动态加载的评论
     */
    function setupMutationObserver() {
        const observer = new MutationObserver((mutations) =&gt; {
            let shouldCheck = false;
            
            mutations.forEach((mutation) =&gt; {
                mutation.addedNodes.forEach((node) =&gt; {
                    if (node.nodeType === 1) { // Element node
                        // 检查是否是评论相关的元素
                        if (node.classList &amp;&amp; (
                            node.classList.toString().includes('comment') ||
                            node.querySelector &amp;&amp; node.querySelector('[class*="comment"]')
                        )) {
                            shouldCheck = true;
                        }
                    }
                });
            });

            if (shouldCheck) {
                // 延迟检查，等待DOM完全渲染
                setTimeout(() =&gt; {
                    checkAllComments();
                }, 1000);
            }
        });

        observer.observe(document.body, {
            childList: true,
            subtree: true
        });
    }

    /**
     * 初始化
     */
    function init() {
        // 排除个人中心页面
        if (window.location.pathname.match(/^\/member\//)) {
            console.log('虎嗅评论过滤插件：跳过个人中心页面');
            return;
        }

        console.log('虎嗅评论过滤插件已启动（无缓存模式）');
        
        // 等待页面加载完成
        if (document.readyState === 'loading') {
            document.addEventListener('DOMContentLoaded', () =&gt; {
                setTimeout(checkAllComments, 2000);
                setupMutationObserver();
            });
        } else {
            setTimeout(checkAllComments, 2000);
            setupMutationObserver();
        }

        // 定期检查新评论
        setInterval(checkAllComments, CHECK_INTERVAL);
    }

    // 启动
    init();
})();
</code></pre></div></div>
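
<p>The script clears <code class="language-plaintext highlighter-rouge">pendingChecks</code> when a request finishes, which suggests that concurrent checks for the same user are meant to share one request. A minimal sketch of that idea follows; <code class="language-plaintext highlighter-rouge">createInFlightDeduper</code> and the stand-in <code class="language-plaintext highlighter-rouge">fetchPages</code> are hypothetical names and not part of the original script:</p>

```javascript
// Hypothetical helper: share one in-flight request per uid.
// `fetchPages` stands in for the real API call and should return a Promise.
function createInFlightDeduper(fetchPages) {
    const inFlight = new Map(); // uid -> Promise of the total page count

    return function getPagesOnce(uid) {
        if (inFlight.has(uid)) {
            return inFlight.get(uid); // reuse the request that is already running
        }
        // The executor runs synchronously, so fetchPages is called right away
        const promise = new Promise((resolve) => resolve(fetchPages(uid)))
            .finally(() => inFlight.delete(uid)); // forget the entry once settled (no caching)
        inFlight.set(uid, promise);
        return promise;
    };
}

// Usage with a fake fetcher that counts how often it is called
let calls = 0;
const getPagesOnce = createInFlightDeduper((uid) => {
    calls += 1;
    return Promise.resolve(3);
});
const first = getPagesOnce('42');
const second = getPagesOnce('42'); // same uid while the first request is pending
first.then((pages) => console.log(`pages=${pages}, fetcher calls=${calls}`));
```

<p>Because the entry is removed as soon as the request settles, a later comment by the same user triggers a fresh request, which is in line with the script's "no cache" mode.</p>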

<p>Result:<br />
<img src="/assets/images/posts/24-156980598/img-001.png" alt="在这里插入图片描述" /></p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="javascript" /><category term="虎嗅" /><category term="油猴脚本" /><summary type="html"><![CDATA[本文介绍了一个名为"虎嗅评论过滤"的油猴脚本，用于自动检测并屏蔽虎嗅网站上评论数超过100页的用户评论。脚本通过定时检查新评论（间隔2秒），从评论元素中采用多种方式提取用户ID（包括会员链接、data属性、class/id等），然后向虎嗅API发送请求获取用户评论总页数。当检测到用户评论页数超过设定阈值时，脚本会自动屏蔽该用户的评论。脚本还实现了请求队列管理，避免重复检查同一用户，并包含详细的调试日志功能，便于排查问题。]]></summary></entry><entry><title type="html">Mydumper一致性数据dump</title><link href="https://gaoxingliang.github.io/blog/2026/01/09/mydumper-dump-156764768/" rel="alternate" type="text/html" title="Mydumper一致性数据dump" /><published>2026-01-09T07:41:56+00:00</published><updated>2026-01-09T07:41:56+00:00</updated><id>https://gaoxingliang.github.io/blog/2026/01/09/mydumper-dump-156764768</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2026/01/09/mydumper-dump-156764768/"><![CDATA[<h2 id="背景">背景</h2>

<p>My company purchased a very large database. The query below lists its tables with their approximate row counts:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT
    table_schema AS database_name,
    table_name,
    table_rows AS approx_rows
FROM information_schema.tables
WHERE table_schema = 'xx'   -- replace with your database name
  AND engine = 'InnoDB'            -- optional: only list InnoDB tables
ORDER BY table_rows DESC;
</code></pre></div></div>

<p><img src="/assets/images/posts/mydumper-dump-156764768/img-001.png" alt="Query result: approximate row count per table" /></p>

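<p>To get the MySQL data into ClickHouse for analysis, I plan to proceed in three steps:<br />
 (1) Survey the ways to move data from MySQL to ClickHouse.<br />
 (2) Import the full data set into ClickHouse.<br />
 (3) Import incremental changes.</p>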

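<h2 id="mysql到clickhouse的方式">Ways to move data from MySQL to ClickHouse</h2>

<h3 id="方式1-clickhouse-mysql-engine">Option 1: the ClickHouse MySQL database engine</h3>

<p>As described at https://clickhouse.com/docs/engines/database-engines/mysql, a remote MySQL database can be attached with a statement like this:</p>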

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE DATABASE mysql_db ENGINE = MySQL('localhost:3306', 'test', 'my_user', 'user_password') SETTINGS read_write_timeout=10000, connect_timeout=100;
</code></pre></div></div>

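<p>However, this approach still forwards every read and write to the remote MySQL server, so it cannot deliver fast analysis or make full use of ClickHouse's OLAP capabilities. It also only supports <code class="language-plaintext highlighter-rouge">INSERT</code> and <code class="language-plaintext highlighter-rouge">SELECT</code>.</p>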

<h3 id="方式2-clickhouse-materialized-view">Option 2: the ClickHouse MaterializedMySQL engine</h3>

<p>Around version 22, ClickHouse could replicate MySQL data directly through the experimental <code class="language-plaintext highlighter-rouge">MaterializedMySQL</code> database engine (see this <a href="https://www.percona.com/blog/complete-walkthrough-mysql-to-clickhouse-replication-using-materializedmysql-engine/">walkthrough</a>), but the engine has been removed in newer versions (<a href="https://github.com/ClickHouse/ClickHouse/pull/73879">PR</a>).</p>

<p>On version 22 it can be used like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set allow_experimental_database_materialized_mysql = 1;
CREATE DATABASE tableXXX on cluster 'all-nodes' ENGINE = MaterializedMySQL(
'mysql:3306', 'dbxxxx', 'username', 'pass')
settings 
materialized_mysql_tables_list = 'interface_access_log,enterprise_info'
TABLE OVERRIDE interface_access_log (
    PARTITION BY  toYYYYMM(gmt_create)
    ORDER BY (gmt_create, id)
)
</code></pre></div></div>

<p>This approach requires every table to have an explicit primary key.</p>

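<h3 id="方式3-将数据导出后-恢复到clickhouse并增量同步">Option 3: export the data, restore it into ClickHouse, then sync incrementally</h3>

<p>This is the approach this article takes. While looking into full MySQL exports I also evaluated the usual export tools and found that, for the data volume above, mysqldump is too slow and produces files that are too large. A mysqldump example:</p>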

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mysqldump \
  --single-transaction \
  --master-data=2 \
  --routines \
  --triggers \
  --events \
  --hex-blob \
  --default-character-set=utf8mb4 \
  --host=127.0.0.1 \
  --port=3306 \
  --user=backup_user \
  --password='your_password' \
  your_database_name \
  &gt; /backup/your_database_$(date +%Y%m%d).sql
</code></pre></div></div>

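<p><code class="language-plaintext highlighter-rouge">--single-transaction</code> gives the dump a consistent view for its whole duration without blocking normal CRUD, but it does block DDL (ALTER TABLE and the like). <code class="language-plaintext highlighter-rouge">--master-data=2</code> records the binlog file and position at the time of the dump as a comment in the output (newer MySQL 8.0 releases name this option <code class="language-plaintext highlighter-rouge">--source-data</code>).<br />
 The drawback is that it is too slow and the resulting file is huge, which makes it hard to analyze and import, so I looked into mydumper.</p>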

<h2 id="mydumper">mydumper</h2>

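<p>The mydumper project lives at https://github.com/mydumper/mydumper. I mainly care about how it exports a backup both fast and consistently, so I read through the relevant implementation.<br />
 The typical flow (core mechanism) by which mydumper guarantees consistency under FTWRL is:</p>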

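<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1. The main control connection acquires the global read lock (FTWRL)
    All tables enter a read-locked state: new writes are blocked and in-flight writes are waited for (the whole server reaches a quiescent point).
    At this moment the database is at one well-defined point in time.

2. Record the position needed for replication / incremental sync (metadata)
    While the read lock is still held, mydumper reads and writes out the metadata, which typically includes:
    binlog file/position (and possibly the GTID set). This guarantees that "which source position this dump corresponds to" is exact.

3. All worker threads establish a snapshot of the same point in time behind a barrier
    Each thread usually reads its own tables/chunks over a separate connection.
    While FTWRL is still held, mydumper has these connections apply their consistent-read settings and open a transaction snapshot at nearly the same time (typically REPEATABLE READ plus consistent-snapshot semantics).
    Because writes are blocked at that moment, the read views of these transactions are equivalent to the same point in time.

4. Release FTWRL so the server is writable again; the dump threads keep reading concurrently
    After the lock is released, application writes continue.
    Each thread, however, reads inside its own transaction snapshot:
    InnoDB's MVCC guarantees that it still sees the row versions as of the snapshot taken under the lock (versions committed later are invisible to these transactions).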
</code></pre></div></div>

<p>Java pseudocode:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Java pseudocode: FTWRL + multi-threaded consistent snapshot (barrier synchronization)

// Pseudo-code (Java-like), illustrating mydumper's idea:
// 1) Hold FTWRL briefly to freeze writes
// 2) Record binlog/gtid position under lock
// 3) Let all worker connections START consistent snapshots at the same point
// 4) Release FTWRL, workers dump concurrently using their own snapshot
 
class DumpCoordinator {
 
  String host;
  int port;
  String user;
  String password;
  String database;
  int threads;
  long chunkSizeBytes;
 
  void runDump() throws Exception {
    Connection ctrl = openConnection();   // control connection
    ctrl.setAutoCommit(true);
 
    // Barrier to ensure all workers have created snapshot before unlocking
    CyclicBarrier snapshotBarrier = new CyclicBarrier(threads + 1);
 
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List&lt;TableTask&gt; tasks = planTableAndChunkTasks(database, chunkSizeBytes);
 
    // Start workers first (they will wait until coordinator says "snapshot now")
    for (int i = 0; i &lt; threads; i++) {
      pool.submit(new DumpWorker(i, snapshotBarrier, tasks));
    }
 
    // 0) Phase "connected": wait until every worker has opened its connection,
    //    so no connection setup happens while the global lock is held
    snapshotBarrier.await();
 
    // 1) Acquire global read lock (FTWRL)
    exec(ctrl, "FLUSH TABLES WITH READ LOCK");
 
    // 2) Read metadata (binlog pos / gtid) while lock is held
    BinlogPoint p = readBinlogPoint(ctrl);     // e.g. SHOW MASTER STATUS / SHOW BINARY LOG STATUS
    writeMetadataFile(p);
 
    // 3) Phase "snapshot now": release the workers so they start their
    //    consistent snapshots while FTWRL is still held
    snapshotBarrier.await();
 
    //    Phase "snapshot ready": returns only when every worker has executed
    //    START TRANSACTION WITH CONSISTENT SNAPSHOT (a CyclicBarrier is reusable)
    snapshotBarrier.await();
 
    // 4) Release lock quickly so production can continue
    exec(ctrl, "UNLOCK TABLES");
 
    pool.shutdown();
    pool.awaitTermination(24, TimeUnit.HOURS);
    ctrl.close();
  }
 
  Connection openConnection() { /* DriverManager.getConnection(...) */ return null; }
 
  void exec(Connection c, String sql) { /* execute sql */ }
 
  BinlogPoint readBinlogPoint(Connection c) { /* query master status */ return null; }
 
  void writeMetadataFile(BinlogPoint p) { /* write metadata */ }
 
  List&lt;TableTask&gt; planTableAndChunkTasks(String db, long chunkBytes) {
    // Inspect table sizes / PK ranges, split into chunks:
    // - small tables: one task
    // - large tables: multiple chunk tasks (pk ranges)
    return new ArrayList&lt;&gt;();
  }
}
 
class DumpWorker implements Runnable {
  int workerId;
  CyclicBarrier snapshotBarrier;
  List&lt;TableTask&gt; sharedTasks;
 
  DumpWorker(int workerId, CyclicBarrier barrier, List&lt;TableTask&gt; tasks) {
    this.workerId = workerId;
    this.snapshotBarrier = barrier;
    this.sharedTasks = tasks;
  }
 
  @Override
  public void run() {
    Connection conn = openConnection();
    conn.setAutoCommit(false);
 
    // Ensure snapshot semantics (InnoDB MVCC)
    exec(conn, "SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ");
 
    // Phase "connected": tell the coordinator this connection is ready
    await(snapshotBarrier);
 
    // Phase "snapshot now": block until the coordinator holds FTWRL,
    // otherwise the snapshot could be taken before writes are frozen
    await(snapshotBarrier);
 
    // Create a consistent snapshot while FTWRL is held
    exec(conn, "START TRANSACTION WITH CONSISTENT SNAPSHOT");
 
    // Phase "snapshot ready": signal that the snapshot exists
    await(snapshotBarrier);
 
    // After this point coordinator may UNLOCK TABLES;
    // this worker keeps reading the SAME snapshot via MVCC.
    while (true) {
      TableTask task = pollNextTask(sharedTasks);
      if (task == null) break;
 
      dumpTableOrChunk(conn, task);
    }
 
    exec(conn, "COMMIT");
    closeQuietly(conn);
  }
 
  Connection openConnection() { return null; }
 
  void exec(Connection c, String sql) { /* execute sql */ }
 
  void await(CyclicBarrier b) { /* b.await() */ }
 
  TableTask pollNextTask(List&lt;TableTask&gt; tasks) {
    // synchronized(tasks) { pop next }
    return null;
  }
 
  void dumpTableOrChunk(Connection conn, TableTask task) {
    // Example chunk query:
    // SELECT * FROM db.table WHERE pk &gt;= ? AND pk &lt; ? ORDER BY pk;
    // Stream rows -&gt; write file part
  }
 
  void closeQuietly(Connection c) {}
}
 
class TableTask {
  String table;
  boolean isChunk;
  long pkStartInclusive;
  long pkEndExclusive;
}
 
class BinlogPoint {
  String binlogFile;
  long binlogPos;
  String gtidSet; // optional
}
</code></pre></div></div>

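<p>The design is quite clever. In short, it combines MVCC, a shared task queue that idle workers pull from, and a barrier (a CyclicBarrier in the pseudocode), with connections opened ahead of time to keep the lock-hold time short.</p>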
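
<p>The pseudocode leaves <code class="language-plaintext highlighter-rouge">planTableAndChunkTasks</code> as a stub. The sketch below shows one way chunk tasks could be planned, assuming a numeric primary key. <code class="language-plaintext highlighter-rouge">planChunks</code> is a hypothetical helper whose field names mirror <code class="language-plaintext highlighter-rouge">TableTask</code> from the pseudocode; it is not mydumper's actual chunking algorithm:</p>

```javascript
// Hypothetical chunk planner for one table: split the half-open
// primary-key range [minPk, maxPkExclusive) into fixed-size chunks.
function planChunks(table, minPk, maxPkExclusive, rowsPerChunk) {
    if (!(rowsPerChunk > 0)) {
        throw new RangeError('rowsPerChunk must be positive');
    }
    const tasks = [];
    for (let start = minPk; maxPkExclusive > start; start += rowsPerChunk) {
        tasks.push({
            table,
            pkStartInclusive: start,
            pkEndExclusive: Math.min(start + rowsPerChunk, maxPkExclusive),
        });
    }
    return tasks;
}

// Workers would pull tasks from one shared queue until it is empty.
const queue = planChunks('enterprise_info', 1, 26, 10);
let task = queue.shift();
while (task !== undefined) {
    console.log(`${task.table}: pk in [${task.pkStartInclusive}, ${task.pkEndExclusive})`);
    task = queue.shift();
}
```

<p>Because every chunk is a half-open range, adjacent chunks never overlap, and each worker can dump its chunk inside its own snapshot. Gaps in the key space only make the chunks uneven in size, not incorrect.</p>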

<p>An example command:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># --compress additionally compresses the output files (zst)
mydumper --sync-thread-lock-mode=FTWRL --port=3306 --host=192.168.102. --user=root --password='xx' --database=xx --outputdir=/root/testdump --threads=8 --chunk-filesize=128 --verbose=3
</code></pre></div></div>

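<p>Among the generated files, <code class="language-plaintext highlighter-rouge">metadata</code> records how the names used for the data files map to the real tables (for example, <code class="language-plaintext highlighter-rouge">mydumper_7</code> stands for the table 费用信息), together with the binlog information:</p>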

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Started dump at: 2026-01-09 14:17:57
[config]
quote-character = BACKTICK

[myloader_session_variables]
SQL_MODE='NO_AUTO_VALUE_ON_ZERO,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION' /*!40101

[source]
# Channel_Name = '' # It can be use to setup replication FOR CHANNEL
# executed_gtid_set = "0-1099-53297657"
# SOURCE_LOG_FILE = "master-bin.000080"
# SOURCE_LOG_POS = 761443754

[`dbxxx`.`admin_punish_credit`]
real_table_name=admin_punish_credit
rows = 26

[`dbxxx`.`admin_punish`]
real_table_name=admin_punish
rows = 23
....

[`dbxxx`.`mydumper_7`]
real_table_name=费用信息
rows = 10000
[config]
max-statement-size = 999998
num-sequences = 0
# Finished dump at: 2026-01-09 14:18:05
</code></pre></div></div>
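
<p>To feed an incremental sync, the binlog coordinates and per-table row counts have to be read back from this file. The sketch below is one possible way to do that; its parsing rules are inferred only from the sample above, and <code class="language-plaintext highlighter-rouge">parseMetadata</code> is a hypothetical helper, not something mydumper provides:</p>

```javascript
// Minimal sketch: extract [source] values and per-table entries from a
// mydumper `metadata` file. The layout is assumed from the sample shown above.
function parseMetadata(text) {
    const result = { source: {}, tables: {} };
    let section = null;

    for (const rawLine of text.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (line === '') continue;

        const header = line.match(/^\[(.+)\]$/);
        if (header) {
            section = header[1];
            continue;
        }
        if (section === null) continue;

        // In [source] the key/value pairs are written as comments ("# KEY = value")
        const body = section === 'source' ? line.replace(/^#\s*/, '') : line;
        const pair = body.match(/^([A-Za-z_][\w-]*)\s*=\s*(.*)$/);
        if (!pair) continue;

        const key = pair[1];
        // Drop a trailing " # comment" and surrounding quotes
        const value = pair[2].split(' # ')[0].trim().replace(/^["']|["']$/g, '');

        if (section === 'source') {
            result.source[key] = value;
        } else if (section.includes('.')) {
            // Table sections look like [`db`.`table`]
            if (!result.tables[section]) result.tables[section] = {};
            result.tables[section][key] = value;
        }
    }
    return result;
}

const sample = [
    '[source]',
    '# SOURCE_LOG_FILE = "master-bin.000080"',
    '# SOURCE_LOG_POS = 761443754',
    '',
    '[`dbxxx`.`admin_punish`]',
    'real_table_name=admin_punish',
    'rows = 23',
].join('\n');
const meta = parseMetadata(sample);
console.log(meta.source.SOURCE_LOG_FILE, meta.source.SOURCE_LOG_POS);
console.log(meta.tables['`dbxxx`.`admin_punish`']);
```

<p>All values are kept as strings here; a real importer would probably convert the position and row counts to numbers and validate that the <code class="language-plaintext highlighter-rouge">[source]</code> section is present before starting replication from it.</p>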

<h2 id="下一步">Next steps</h2>

<p>下一步就是：解析生成相关的Clickhouse table 和 基于Canal的增量复制。</p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="数据库" /><category term="mysql" /><summary type="html"><![CDATA[本文探讨了将MySQL大数据库迁移到ClickHouse进行分析的三种方法：1)使用ClickHouse的MySQL引擎进行远程查询，但性能受限；2)利用MaterializedMySQL引擎(22版本)进行复制，但新版本已移除；3)采用mydumper工具导出数据后增量同步。重点分析了mydumper的工作原理，它通过获取全局读锁(FTWRL)确保数据一致性，记录binlog位置，并在多线程环境下创建一致性快照，最后释放锁以最小化对生产环境的影响。该方法相比mysqldump具有更高性能和更小的文件体积。]]></summary></entry><entry><title type="html">PostgreSQL 完全迁移指南：从 MySQL 到 PostgreSQL 的详细教程</title><link href="https://gaoxingliang.github.io/blog/2025/09/22/postgresql-mysql-postgresql-151959836/" rel="alternate" type="text/html" title="PostgreSQL 完全迁移指南：从 MySQL 到 PostgreSQL 的详细教程" /><published>2025-09-22T04:15:03+00:00</published><updated>2025-09-22T04:15:03+00:00</updated><id>https://gaoxingliang.github.io/blog/2025/09/22/postgresql-mysql-postgresql-151959836</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2025/09/22/postgresql-mysql-postgresql-151959836/"><![CDATA[<blockquote>
  <p>A comprehensive migration guide for senior backend engineers who know MySQL well but have limited PostgreSQL experience</p>
</blockquote>

<h3 id="目录">Table of contents</h3>

<ul>
  <li><a href="#postgresql-%E5%9F%BA%E7%A1%80%E6%A6%82%E5%BF%B5">PostgreSQL fundamentals</a></li>
  <li><a href="#%E6%A0%B8%E5%BF%83%E6%9E%B6%E6%9E%84%E5%B7%AE%E5%BC%82%E8%AF%A6%E8%A7%A3">Core architectural differences</a></li>
  <li><a href="#mvcc-%E6%9C%BA%E5%88%B6%E6%B7%B1%E5%BA%A6%E8%A7%A3%E6%9E%90">MVCC in depth</a></li>
  <li><a href="#%E5%AE%89%E8%A3%85%E4%B8%8E%E5%9F%BA%E7%A1%80%E9%85%8D%E7%BD%AE">Installation and basic configuration</a></li>
  <li><a href="#%E6%95%B0%E6%8D%AE%E7%B1%BB%E5%9E%8B%E5%AF%B9%E6%AF%94%E4%B8%8E%E8%BD%AC%E6%8D%A2">Data type comparison and conversion</a></li>
  <li><a href="#sql-%E8%AF%AD%E6%B3%95%E5%B7%AE%E5%BC%82%E8%AF%A6%E8%A7%A3">SQL syntax differences</a></li>
  <li><a href="#%E6%80%A7%E8%83%BD%E4%BC%98%E5%8C%96%E5%AE%8C%E6%95%B4%E6%8C%87%E5%8D%97">Performance tuning guide</a></li>
  <li><a href="#%E7%B4%A2%E5%BC%95%E7%AD%96%E7%95%A5%E4%B8%8E%E4%BC%98%E5%8C%96">Index strategy and optimization</a></li>
  <li><a href="#%E5%B8%B8%E8%A7%81%E9%99%B7%E9%98%B1%E4%B8%8E%E8%A7%A3%E5%86%B3%E6%96%B9%E6%A1%88">Common pitfalls and solutions</a></li>
  <li><a href="#%E6%89%A9%E5%B1%95%E7%94%9F%E6%80%81%E7%B3%BB%E7%BB%9F%E8%AF%A6%E8%A7%A3">The extension ecosystem</a></li>
  <li><a href="#%E7%9B%91%E6%8E%A7%E4%B8%8E%E8%AF%8A%E6%96%AD%E5%AE%8C%E6%95%B4%E6%96%B9%E6%A1%88">Monitoring and diagnostics</a></li>
  <li><a href="#%E5%A4%87%E4%BB%BD%E4%B8%8E%E6%81%A2%E5%A4%8D%E7%AD%96%E7%95%A5">Backup and recovery strategy</a></li>
  <li><a href="#%E9%AB%98%E5%8F%AF%E7%94%A8%E4%B8%8E%E9%9B%86%E7%BE%A4%E9%85%8D%E7%BD%AE">High availability and clustering</a></li>
  <li><a href="#%E8%BF%81%E7%A7%BB%E7%AD%96%E7%95%A5%E4%B8%8E%E5%B7%A5%E5%85%B7">Migration strategy and tools</a></li>
  <li><a href="#%E6%95%85%E9%9A%9C%E6%8E%92%E9%99%A4%E6%8C%87%E5%8D%97">Troubleshooting guide</a></li>
</ul>

<hr />

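<h3 id="postgresql-基础概念">PostgreSQL fundamentals</h3>

<h4 id="什么是-postgresql">What is PostgreSQL?</h4>

<p>PostgreSQL is a powerful open-source object-relational database system with more than 35 years of development history. Compared with MySQL, it offers a richer feature set and closer conformance to the SQL standard.</p>

<h4 id="核心术语对比">Core terminology compared</h4>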

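<table>
  <thead>
    <tr>
      <th>Concept</th>
      <th>MySQL</th>
      <th>PostgreSQL</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Database instance</td>
      <td>Instance</td>
      <td>Cluster</td>
      <td>In PostgreSQL, one instance (cluster) can contain multiple databases</td>
    </tr>
    <tr>
      <td>Database</td>
      <td>Database</td>
      <td>Database</td>
      <td>Similar concept, but a PostgreSQL connection is bound to one database, so MySQL databases often map to PostgreSQL schemas</td>
    </tr>
    <tr>
      <td>Tablespace</td>
      <td>Tablespace</td>
      <td>Tablespace</td>
      <td>More capable; can be used across databases</td>
    </tr>
    <tr>
      <td>Storage engine</td>
      <td>InnoDB/MyISAM</td>
      <td>Single built-in engine (heap)</td>
      <td>PostgreSQL uses one unified storage engine</td>
    </tr>
    <tr>
      <td>Transaction isolation</td>
      <td>4 levels</td>
      <td>4 levels accepted, 3 implemented</td>
      <td>Implemented differently: PostgreSQL treats READ UNCOMMITTED as READ COMMITTED and defaults to READ COMMITTED, while InnoDB defaults to REPEATABLE READ</td>
    </tr>
  </tbody>
</table>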

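<h4 id="postgresql-的核心优势">Core strengths of PostgreSQL</h4>

<ol>
  <li><strong>Standards compliance</strong>: closely follows the SQL standard</li>
  <li><strong>Extensibility</strong>: supports 1,200+ extensions</li>
  <li><strong>Rich data types</strong>: JSON, arrays, range types, and more</li>
  <li><strong>Concurrency control</strong>: MVCC-based, so readers and writers do not block each other</li>
  <li><strong>ACID integrity</strong>: full ACID support</li>
</ol>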

<hr />

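<h3 id="核心架构差异详解">Core architectural differences</h3>

<h4 id="1-多版本并发控制-mvcc-的根本差异">1. Fundamental differences in multi-version concurrency control (MVCC)</h4>

<h5 id="mysql-vs-postgresql-mvcc-对比">MySQL vs PostgreSQL MVCC</h5>

<p><strong>MySQL InnoDB MVCC:</strong></p>

<ul>
  <li>Uses <strong>delta storage</strong>: only the changed columns are recorded</li>
  <li>Version chain: newest-to-oldest (N2O)</li>
  <li>Rollback segments: old versions live in undo logs (in the system tablespace in older releases; in separate undo tablespaces by default since MySQL 8.0)</li>
  <li>Indexes: secondary indexes store a logical identifier (the primary key value)</li>
</ul>

<p><strong>PostgreSQL MVCC:</strong></p>

<ul>
  <li>Uses <strong>append-only storage</strong>: the whole row is copied</li>
  <li>Version chain: oldest-to-newest (O2N)</li>
  <li>Version storage: old and new versions are stored together in the table's own heap pages</li>
  <li>Indexes: store the physical address of the tuple</li>
</ul>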

<h5 id="具体示例对比">A concrete example</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Assume a users table with 50 columns
CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    -- ... 47 more columns
    last_login TIMESTAMP
);

-- Update a single column
UPDATE users SET last_login = NOW() WHERE id = 1;
</code></pre></div></div>

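<p><strong>What MySQL does:</strong></p>

<ul>
  <li>Stores only the old value of <code class="language-plaintext highlighter-rouge">last_login</code> in the rollback segment</li>
  <li>Updates only the <code class="language-plaintext highlighter-rouge">last_login</code> column in the main table</li>
  <li>Leaves the indexes untouched (as long as <code class="language-plaintext highlighter-rouge">last_login</code> is not indexed)</li>
</ul>

<p><strong>What PostgreSQL does:</strong></p>

<ul>
  <li>Copies the whole row (all 50 columns) to a new location</li>
  <li>Updates every index to point at the new location, unless the update qualifies as a HOT (heap-only tuple) update: no indexed column changed and the new version fits on the same page</li>
  <li>Marks the original row as a "dead tuple" once no transaction can see it any more</li>
</ul>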

<h5 id="性能影响分析">Performance impact</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Monitor dead tuples as an indicator of table bloat
-- (pg_stat_user_tables exposes the table name as relname, not tablename)
SELECT 
    schemaname,
    relname AS tablename,
    pg_size_pretty(pg_total_relation_size(relid)) as total_size,
    pg_size_pretty(pg_relation_size(relid)) as table_size,
    pg_size_pretty(pg_indexes_size(relid)) as index_size,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / (n_live_tup + n_dead_tup), 2) as dead_ratio
FROM pg_stat_user_tables 
WHERE n_dead_tup &gt; 0
ORDER BY dead_ratio DESC;
</code></pre></div></div>

<h4 id="2-存储引擎架构差异">2. Storage engine architecture</h4>

<h5 id="mysql-存储引擎架构">MySQL storage engines</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL supports multiple storage engines
CREATE TABLE table1 (id INT) ENGINE=InnoDB;    -- transactional
CREATE TABLE table2 (id INT) ENGINE=MyISAM;    -- non-transactional
CREATE TABLE table3 (id INT) ENGINE=Memory;    -- in-memory table
</code></pre></div></div>

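<h5 id="postgresql-统一架构">PostgreSQL's unified architecture</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- PostgreSQL has a single built-in storage engine, but supports pluggable table access methods
CREATE TABLE table1 (id INT);  -- heap table by default
CREATE TABLE table2 (id INT) USING heap;  -- heap specified explicitly

-- Custom access methods are possible (through extensions or patched builds).
-- zheap is experimental and not part of stock PostgreSQL, so the two statements below fail on a standard installation.
CREATE EXTENSION zheap;  -- experimental new storage engine
CREATE TABLE table3 (id INT) USING zheap;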
</code></pre></div></div>

<h5 id="存储引擎对比表">Storage engine comparison</h5>

<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th>MySQL InnoDB</th>
      <th>PostgreSQL Heap</th>
      <th>PostgreSQL zheap (experimental)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Transactions</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
    </tr>
    <tr>
      <td>Foreign key constraints</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
    </tr>
    <tr>
      <td>Row-level locking</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
    </tr>
    <tr>
      <td>Crash recovery</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
    </tr>
    <tr>
      <td>Version storage</td>
      <td>Delta</td>
      <td>Full-row copy</td>
      <td>Delta (experimental)</td>
    </tr>
    <tr>
      <td>Table bloat</td>
      <td>Less</td>
      <td>More</td>
      <td>Less</td>
    </tr>
    <tr>
      <td>Index maintenance</td>
      <td>Logical ID</td>
      <td>Physical address</td>
      <td>Logical ID</td>
    </tr>
  </tbody>
</table>

<h4 id="3-数据类型系统差异">3. Data type systems</h4>

<h5 id="mysql-数据类型特点">MySQL data types</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL: a comparatively simple set of data types
CREATE TABLE mysql_example (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255),
    email VARCHAR(255),
    age TINYINT,
    salary DECIMAL(10,2),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    data JSON
);
</code></pre></div></div>

<h5 id="postgresql-丰富的数据类型">PostgreSQL's rich data types</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The enum type must exist before a table can reference it
CREATE TYPE user_status AS ENUM ('active', 'inactive', 'pending');

-- PostgreSQL supports a richer set of data types
CREATE TABLE postgres_example (
    id SERIAL PRIMARY KEY,  -- auto-incrementing sequence
    name VARCHAR(255),
    email VARCHAR(255),
    age SMALLINT,  -- PostgreSQL has no TINYINT; SMALLINT (2 bytes) is the smallest integer type
    salary NUMERIC(10,2),  -- exact numeric
    created_at TIMESTAMPTZ DEFAULT NOW(),  -- timestamp with time zone
    data JSONB,  -- binary JSON, indexable
    tags TEXT[],  -- array type
    status user_status,  -- enum type
    location POINT,  -- geometric type
    search_vector TSVECTOR,  -- full-text search vector
    valid_period DATERANGE  -- range type
);
</code></pre></div></div>

<h5 id="数据类型映射表">Data type mapping</h5>

<table>
  <thead>
    <tr>
      <th>MySQL type</th>
      <th>PostgreSQL type</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">INT AUTO_INCREMENT</code></td>
      <td><code class="language-plaintext highlighter-rouge">SERIAL</code> or <code class="language-plaintext highlighter-rouge">BIGSERIAL</code></td>
      <td>Auto-incrementing primary key</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">VARCHAR(n)</code></td>
      <td><code class="language-plaintext highlighter-rouge">VARCHAR(n)</code> or <code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td>String types</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TINYINT</code></td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
      <td>Small integer</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DECIMAL(p,s)</code></td>
      <td><code class="language-plaintext highlighter-rouge">NUMERIC(p,s)</code></td>
      <td>Exact numeric</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMP</code></td>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMPTZ</code></td>
      <td>Timestamp with time zone</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">JSON</code></td>
      <td><code class="language-plaintext highlighter-rouge">JSONB</code></td>
      <td>Binary JSON</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ENUM</code></td>
      <td><code class="language-plaintext highlighter-rouge">ENUM</code> or <code class="language-plaintext highlighter-rouge">CHECK</code></td>
      <td>Enumerated values</td>
    </tr>
    <tr>
      <td>-</td>
      <td><code class="language-plaintext highlighter-rouge">ARRAY</code></td>
      <td>Array type (not supported by MySQL)</td>
    </tr>
    <tr>
      <td>-</td>
      <td><code class="language-plaintext highlighter-rouge">RANGE</code></td>
      <td>Range types (not supported by MySQL)</td>
    </tr>
    <tr>
      <td>-</td>
      <td><code class="language-plaintext highlighter-rouge">UUID</code></td>
      <td>UUID type (MySQL has no native UUID type)</td>
    </tr>
  </tbody>
</table>

<hr />

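<h3 id="mvcc-机制深度解析">MVCC in depth</h3>

<h4 id="什么是-mvcc">What is MVCC?</h4>

<p>Multi-version concurrency control (MVCC) is a concurrency control method that lets multiple transactions read and write the database at the same time without readers and writers blocking each other (two writers updating the same row still wait for one another). PostgreSQL's MVCC implementation differs fundamentally from MySQL's.</p>

<h4 id="postgresql-mvcc-工作原理">How PostgreSQL MVCC works</h4>

<h5 id="1-版本存储机制">1. Version storage</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create a test table
CREATE TABLE test_mvcc (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    value INTEGER
);

-- Insert initial data
INSERT INTO test_mvcc (name, value) VALUES ('test', 100);

-- Inspect the tuple's system columns
SELECT ctid, xmin, xmax, * FROM test_mvcc;
-- ctid: physical location (page number, item number within the page)
-- xmin: ID of the transaction that created this version
-- xmax: ID of the transaction that deleted or superseded this version (0 if none)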
</code></pre></div></div>

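<h5 id="2-更新操作详解">2. What an UPDATE does</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Start a transaction
BEGIN;

-- Update the row
UPDATE test_mvcc SET value = 200 WHERE id = 1;

-- Inspect the row in this same session
SELECT ctid, xmin, xmax, * FROM test_mvcc;
-- The ctid is new: the row was copied to a new location.
-- Until COMMIT, another session still sees the old version (old ctid), with xmax set to this transaction's ID.

COMMIT;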
</code></pre></div></div>
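
<p>A toy model can make these visibility rules concrete. The sketch below assumes that a snapshot is simply the set of transaction IDs whose changes are visible, and it ignores hint bits, subtransactions and row locks, so it only approximates what PostgreSQL really does. <code class="language-plaintext highlighter-rouge">isVisible</code> and the transaction IDs 100 and 101 are made up for illustration:</p>

```javascript
// Simplified tuple visibility check (an illustration, not PostgreSQL's algorithm).
// tuple.xmin: creating transaction; tuple.xmax: deleting/updating transaction (0 = none).
function isVisible(tuple, snapshot) {
    if (!snapshot.has(tuple.xmin)) return false; // creator is not visible yet
    if (tuple.xmax === 0) return true;           // never deleted or updated
    return !snapshot.has(tuple.xmax);            // visible while the deleter is not
}

// Row id=1 after transaction 101 ran "UPDATE ... SET value = 200",
// replacing the version that transaction 100 had inserted.
const heap = [
    { ctid: '(0,1)', xmin: 100, xmax: 101, value: 100 }, // old version
    { ctid: '(0,2)', xmin: 101, xmax: 0, value: 200 },   // new version
];

const otherSessionBeforeCommit = new Set([100]);   // 101 is still in progress
const anySessionAfterCommit = new Set([100, 101]); // 101 has committed

const before = heap.filter((t) => isVisible(t, otherSessionBeforeCommit)).map((t) => t.value);
const after = heap.filter((t) => isVisible(t, anySessionAfterCommit)).map((t) => t.value);
console.log('other session before COMMIT sees', before);
console.log('sessions after COMMIT see', after);
```

<p>Before the commit the other session only finds the old version (value 100); afterwards only the new version (value 200) is visible, and the old one becomes a dead tuple that a later vacuum can remove.</p>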

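<h5 id="3-版本链遍历">3. Walking the version chain</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Simulate several updates
BEGIN;
UPDATE test_mvcc SET value = 300 WHERE id = 1;
UPDATE test_mvcc SET value = 400 WHERE id = 1;
COMMIT;

-- Inspecting the version chain needs a special tool or extension, for example pageinspect:
--   CREATE EXTENSION pageinspect;
--   SELECT lp, t_xmin, t_xmax, t_ctid FROM heap_page_items(get_raw_page('test_mvcc', 0));
-- A normal query only returns the newest visible version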
</code></pre></div></div>

<h4 id="postgresql-mvcc-的四大问题">Four problems with PostgreSQL MVCC</h4>

<h5 id="1-版本复制开销">1. Version copying overhead</h5>

<p><strong>Problem:</strong><br />
 PostgreSQL copies the entire row on every update, even when only one column changes.</p>

<p><strong>Example:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create a table with many columns
CREATE TABLE large_table (
    id SERIAL PRIMARY KEY,
    field1 VARCHAR(100),
    field2 VARCHAR(100),
    field3 VARCHAR(100),
    -- ... assume 100 columns in total
    field100 VARCHAR(100),
    status VARCHAR(20)
);

-- Update a single column
UPDATE large_table SET status = 'active' WHERE id = 1;
-- PostgreSQL copies all 100 columns to a new location
</code></pre></div></div>

<p><strong>Performance impact:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Monitor table size and write activity
-- (pg_stat_user_tables exposes the table name as relname, not tablename)
SELECT 
    schemaname,
    relname AS tablename,
    pg_size_pretty(pg_total_relation_size(relid)) as size,
    n_tup_ins as inserts,
    n_tup_upd as updates,
    n_tup_del as deletes
FROM pg_stat_user_tables 
WHERE relname = 'large_table';
</code></pre></div></div>

<p><strong>Solutions:</strong></p>

<ol>
  <li><strong>Use zheap (experimental):</strong></li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Install zheap (needs a PostgreSQL build compiled with zheap support; not available on stock PostgreSQL)
CREATE EXTENSION zheap;

-- Use the zheap storage engine
CREATE TABLE optimized_table (
    id SERIAL PRIMARY KEY,
    data TEXT,
    status VARCHAR(20)
) USING zheap;
</code></pre></div></div>

<ol start="2">
  <li><strong>Optimize the table structure:</strong></li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Avoid overly wide tables; consider splitting them vertically
CREATE TABLE user_basic_info (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100)
);

CREATE TABLE user_extended_info (
    user_id INTEGER PRIMARY KEY REFERENCES user_basic_info(id),  -- one extended row per user
    profile_data JSONB,
    preferences JSONB
);
</code></pre></div></div>

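<ol start="3">
  <li><strong>Reorganize periodically with pg_repack:</strong></li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Install pg_repack
# Ubuntu/Debian
sudo apt-get install postgresql-15-repack

# Reorganize a table (the database also needs CREATE EXTENSION pg_repack,
# and the table needs a primary key or a unique NOT NULL index)
pg_repack -d your_database -t your_table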
</code></pre></div></div>

<h5 id="2-表膨胀问题">2. Table bloat</h5>

<p><strong>Problem:</strong><br />
 Dead tuples take up storage space and degrade query performance.</p>

<p><strong>Monitoring table bloat:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create a monitoring view (the dead-tuple ratio is used as a proxy for bloat)
CREATE OR REPLACE VIEW table_bloat_monitor AS
SELECT 
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) as total_size,
    pg_size_pretty(pg_relation_size(relid)) as table_size,
    n_live_tup as live_tuples,
    n_dead_tup as dead_tuples,
    CASE 
        WHEN n_live_tup + n_dead_tup &gt; 0 
        THEN round(n_dead_tup * 100.0 / (n_live_tup + n_dead_tup), 2)
        ELSE 0 
    END as dead_ratio,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables 
WHERE n_dead_tup &gt; 0
ORDER BY dead_ratio DESC;

-- Query the monitoring view
SELECT * FROM table_bloat_monitor WHERE dead_ratio &gt; 10;
</code></pre></div></div>

<p><strong>Autovacuum tuning:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Global settings
ALTER SYSTEM SET autovacuum = on;
ALTER SYSTEM SET autovacuum_max_workers = 3;  -- in PostgreSQL 15, changing this requires a restart
ALTER SYSTEM SET autovacuum_naptime = '1min';
ALTER SYSTEM SET autovacuum_vacuum_threshold = 50;
ALTER SYSTEM SET autovacuum_analyze_threshold = 50;
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1;  -- lowered from the 0.2 default to 10%
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05;  -- lowered from the 0.1 default to 5%

-- Per-table settings (for large tables)
ALTER TABLE large_table SET (
    autovacuum_vacuum_scale_factor = 0.05,  -- trigger at 5%
    autovacuum_analyze_scale_factor = 0.02,  -- trigger at 2%
    autovacuum_vacuum_cost_delay = 1,  -- in ms; below the 2 ms default (PostgreSQL 12+), so vacuum pauses less
    autovacuum_vacuum_cost_limit = 1000  -- larger cost budget per round
);

-- Reload the configuration
SELECT pg_reload_conf();
</code></pre></div></div>
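
<p>For context on the scale factors above: autovacuum vacuums a table once its dead tuples exceed <code class="language-plaintext highlighter-rouge">autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples</code>. The query below is a rough sketch of that calculation. It uses the global settings only and ignores per-table overrides, and the alias <code class="language-plaintext highlighter-rouge">vacuum_trigger_point</code> is just an illustrative name.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Estimated dead-tuple count at which autovacuum triggers, per table
-- (global settings only; per-table storage parameters are ignored here)
SELECT
    s.schemaname,
    s.relname,
    s.n_dead_tup,
    round(
        current_setting('autovacuum_vacuum_threshold')::numeric
        + current_setting('autovacuum_vacuum_scale_factor')::numeric
          * greatest(c.reltuples, 0)::numeric
    ) as vacuum_trigger_point
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
</code></pre></div></div>

<p>With the defaults (threshold 50, scale factor 0.2), a table of 1,000,000 rows is vacuumed only after about 200,050 dead tuples accumulate, which is why large tables usually benefit from a lower per-table scale factor.</p>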

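<p><strong>Manual vacuum operations:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Plain VACUUM (does not block reads or writes)
VACUUM ANALYZE your_table;

-- VACUUM FULL (rewrites the table and returns space to the OS, but holds an
-- ACCESS EXCLUSIVE lock that blocks both reads and writes)
VACUUM FULL your_table;

-- pg_repack (online reorganization; takes only brief exclusive locks at the start and end)
-- pg_repack -d your_database -t your_table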
</code></pre></div></div>

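<h5 id="3-索引维护开销">3. Index maintenance overhead</h5>

<p><strong>Problem:</strong><br />
 Unless an update qualifies as a HOT (heap-only tuple) update, it has to add a new entry to every index on the table.</p>

<p><strong>HOT update optimization:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- A table layout that allows HOT updates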
CREATE TABLE hot_optimized (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    email VARCHAR(100),
    status VARCHAR(20),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

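-- Index only the columns that are actually queried
CREATE INDEX idx_hot_optimized_name ON hot_optimized(name);
CREATE INDEX idx_hot_optimized_status ON hot_optimized(status);

-- Update a column that no index references (eligible for HOT)
UPDATE hot_optimized SET email = 'new@example.com' WHERE id = 1;
-- This update can be HOT because email is not indexed, provided the page has room for the new row version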
</code></pre></div></div>
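
<p>A HOT update also needs free space on the same heap page for the new row version. One common way to leave that room is a lower <code class="language-plaintext highlighter-rouge">fillfactor</code>. The value 80 below is an illustrative assumption, not a figure from the original text; the right value depends on the update rate.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Leave about 20% of each heap page free so updated row versions can stay on the same page
ALTER TABLE hot_optimized SET (fillfactor = 80);

-- The setting applies to pages written from now on; existing pages keep their
-- current layout until the table is rewritten (for example by VACUUM FULL or pg_repack)
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'hot_optimized';
</code></pre></div></div>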

<p><strong>Monitoring HOT updates:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- HOT update statistics
SELECT 
    schemaname,
    relname,
    n_tup_hot_upd as hot_updates,
    n_tup_upd as total_updates,
    CASE 
        WHEN n_tup_upd &gt; 0 
        THEN round(n_tup_hot_upd * 100.0 / n_tup_upd, 2)
        ELSE 0 
    END as hot_ratio
FROM pg_stat_user_tables 
WHERE n_tup_upd &gt; 0
ORDER BY hot_ratio DESC;
</code></pre></div></div>

<p><strong>Index design:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Avoid indexing frequently updated columns where possible
-- Costly: an index on a status column whose value changes often
CREATE INDEX idx_bad_status ON orders(status);  -- avoid

-- Better: an index on a relatively stable column
CREATE INDEX idx_good_customer ON orders(customer_id);  -- recommended

-- Use a partial index to cover only the rows you query
CREATE INDEX idx_active_orders ON orders(customer_id) 
WHERE status = 'active';
</code></pre></div></div>

<h5 id="4-vacuum-管理复杂性">4. Vacuum management complexity</h5>

<p><strong>Monitoring vacuum status:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create a vacuum monitoring view
CREATE OR REPLACE VIEW vacuum_monitor AS
SELECT 
    schemaname,
    relname,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze,
    vacuum_count,
    autovacuum_count,
    analyze_count,
    autoanalyze_count,
    CASE 
        WHEN last_autovacuum IS NULL THEN 'Never'
        WHEN last_autovacuum &lt; NOW() - INTERVAL '1 day' THEN 'Stale'
        ELSE 'Recent'
    END as vacuum_status
FROM pg_stat_user_tables
ORDER BY last_autovacuum NULLS FIRST;

-- Query the monitoring view
SELECT * FROM vacuum_monitor WHERE vacuum_status IN ('Never', 'Stale');
</code></pre></div></div>

<p><strong>Vacuum blocking:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Find long-running transactions (they prevent vacuum from removing dead tuples);
-- xact_start is used because query_start resets with every statement
SELECT 
    pid,
    usename,
    application_name,
    client_addr,
    state,
    xact_start,
    now() - xact_start as duration,
    query
FROM pg_stat_activity 
WHERE state IN ('active', 'idle in transaction')
  AND now() - xact_start &gt; INTERVAL '1 hour'
ORDER BY duration DESC;

-- Find running vacuum and analyze processes (manual and autovacuum), excluding this session
SELECT 
    pid,
    usename,
    application_name,
    state,
    query_start,
    query
FROM pg_stat_activity 
WHERE (query ILIKE '%vacuum%' OR query ILIKE '%analyze%')
  AND pid &lt;&gt; pg_backend_pid();
</code></pre></div></div>

<p><strong>Vacuum tuning strategy:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Apply different vacuum settings to different tables
-- Large, frequently updated tables: vacuum more often
ALTER TABLE large_frequently_updated_table SET (
    autovacuum_vacuum_scale_factor = 0.02,  -- 2%
    autovacuum_analyze_scale_factor = 0.01,  -- 1%
    autovacuum_vacuum_cost_delay = 1,  -- in ms; below the 2 ms default (PostgreSQL 12+) for more aggressive vacuuming
    autovacuum_vacuum_cost_limit = 2000
);

-- Small tables: default settings
ALTER TABLE small_stable_table SET (
    autovacuum_vacuum_scale_factor = 0.2,  -- 20%
    autovacuum_analyze_scale_factor = 0.1   -- 10%
);

-- Read-only tables: disable autovacuum (anti-wraparound vacuums still run when needed)
ALTER TABLE read_only_table SET (
    autovacuum_enabled = false
);
</code></pre></div></div>
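
<p>To check which tables carry per-table overrides such as the ones above, the storage parameters can be read back from <code class="language-plaintext highlighter-rouge">pg_class.reloptions</code>. A minimal sketch, reusing the table names from the example:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- List ordinary tables that have storage parameters set
SELECT n.nspname as schema_name, c.relname, c.reloptions
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND c.reloptions IS NOT NULL
ORDER BY n.nspname, c.relname;

-- Remove an override so the table falls back to the global setting
ALTER TABLE small_stable_table RESET (
    autovacuum_vacuum_scale_factor,
    autovacuum_analyze_scale_factor
);
</code></pre></div></div>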

<hr />

<h3 id="安装与基础配置">Installation and basic configuration</h3>

<h4 id="postgresql-安装">Installing PostgreSQL</h4>

<h5 id="ubuntudebian-安装">Ubuntu/Debian</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Add the official PostgreSQL repository
# (apt-key is deprecated on recent Debian/Ubuntu releases, where a signed-by keyring is used instead)
sudo apt update
sudo apt install -y wget ca-certificates
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list

# Install PostgreSQL 15
sudo apt update
sudo apt install -y postgresql-15 postgresql-client-15 postgresql-contrib-15

# Start the service
sudo systemctl start postgresql
sudo systemctl enable postgresql
</code></pre></div></div>

<h5 id="centosrhel-安装">CentOS/RHEL</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Install the official PostgreSQL repository
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# Install PostgreSQL 15
sudo yum install -y postgresql15-server postgresql15 postgresql15-contrib

# Initialize the database cluster
sudo /usr/pgsql-15/bin/postgresql-15-setup initdb

# Start the service
sudo systemctl start postgresql-15
sudo systemctl enable postgresql-15
</code></pre></div></div>

<h5 id="docker-安装">Docker</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Run PostgreSQL in Docker
docker run --name postgres-15 \
  -e POSTGRES_PASSWORD=your_password \
  -e POSTGRES_DB=your_database \
  -p 5432:5432 \
  -v postgres_data:/var/lib/postgresql/data \
  -d postgres:15

# Connect to the container
docker exec -it postgres-15 psql -U postgres
</code></pre></div></div>

<h4 id="基础配置">Basic configuration</h4>

<h5 id="1-连接配置">1. Connection settings</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Edit postgresql.conf
sudo nano /etc/postgresql/15/main/postgresql.conf

# Key settings
listen_addresses = '*'          # accept connections on all interfaces
port = 5432                     # port
max_connections = 100           # maximum number of connections
shared_buffers = 256MB          # shared buffer cache
effective_cache_size = 1GB      # planner's estimate of available cache
work_mem = 4MB                  # memory per sort/hash operation
maintenance_work_mem = 64MB     # memory for maintenance operations
</code></pre></div></div>

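<h5 id="2-认证配置">2. Authentication</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Edit pg_hba.conf
sudo nano /etc/postgresql/15/main/pg_hba.conf

# Add connection rules
# (0.0.0.0/0 admits any IPv4 client; restrict it to known networks in production,
#  and prefer scram-sha-256 over md5 on PostgreSQL 10+)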
# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             postgres                                peer
local   all             all                                     md5
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
host    all             all             0.0.0.0/0               md5
</code></pre></div></div>

<h5 id="3-重启服务">3. Restart the service</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Restart PostgreSQL
sudo systemctl restart postgresql

# Check the service status
sudo systemctl status postgresql

# Follow the logs
sudo journalctl -u postgresql -f
</code></pre></div></div>

<h4 id="用户和权限管理">User and privilege management</h4>

<h5 id="创建用户和数据库">Creating users and databases</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Connect to PostgreSQL first (shell command): sudo -u postgres psql

-- Create a user
CREATE USER app_user WITH PASSWORD 'secure_password';

-- Create a database
CREATE DATABASE app_database OWNER app_user;

-- Grant database privileges
GRANT ALL PRIVILEGES ON DATABASE app_database TO app_user;

-- Connect to the new database
\c app_database

-- Grant schema privileges
GRANT ALL ON SCHEMA public TO app_user;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO app_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO app_user;

-- Set default privileges for objects created later
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO app_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO app_user;
</code></pre></div></div>

<h5 id="角色管理">Role management</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create roles
CREATE ROLE readonly_role;
CREATE ROLE write_role;

-- Grant privileges
GRANT CONNECT ON DATABASE app_database TO readonly_role;
GRANT USAGE ON SCHEMA public TO readonly_role;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_role;

GRANT CONNECT ON DATABASE app_database TO write_role;
GRANT USAGE ON SCHEMA public TO write_role;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO write_role;

-- Add the user to the roles
GRANT readonly_role TO app_user;
GRANT write_role TO app_user;
</code></pre></div></div>

<hr />

<h3 id="数据类型对比与转换">Data type comparison and conversion</h3>

<h4 id="数值类型">Numeric types</h4>

<h5 id="mysql-vs-postgresql-数值类型">MySQL vs PostgreSQL numeric types</h5>

<table>
  <thead>
    <tr>
      <th>MySQL type</th>
      <th>PostgreSQL type</th>
      <th>Description</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TINYINT</code></td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
      <td>Small integer (PostgreSQL has no 1-byte integer type)</td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
      <td>Small integer</td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">MEDIUMINT</code></td>
      <td><code class="language-plaintext highlighter-rouge">INTEGER</code></td>
      <td>Medium integer</td>
      <td><code class="language-plaintext highlighter-rouge">INTEGER</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">INT</code></td>
      <td><code class="language-plaintext highlighter-rouge">INTEGER</code></td>
      <td>Integer</td>
      <td><code class="language-plaintext highlighter-rouge">INTEGER</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">BIGINT</code></td>
      <td><code class="language-plaintext highlighter-rouge">BIGINT</code></td>
      <td>Large integer</td>
      <td><code class="language-plaintext highlighter-rouge">BIGINT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DECIMAL(p,s)</code></td>
      <td><code class="language-plaintext highlighter-rouge">NUMERIC(p,s)</code></td>
      <td>Exact numeric</td>
      <td><code class="language-plaintext highlighter-rouge">NUMERIC(10,2)</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">FLOAT</code></td>
      <td><code class="language-plaintext highlighter-rouge">REAL</code></td>
      <td>Single-precision float</td>
      <td><code class="language-plaintext highlighter-rouge">REAL</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DOUBLE</code></td>
      <td><code class="language-plaintext highlighter-rouge">DOUBLE PRECISION</code></td>
      <td>Double-precision float</td>
      <td><code class="language-plaintext highlighter-rouge">DOUBLE PRECISION</code></td>
    </tr>
  </tbody>
</table>

<h5 id="数值类型示例">Numeric type examples</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL table definition
CREATE TABLE mysql_numeric (
    id TINYINT AUTO_INCREMENT PRIMARY KEY,
    small_num SMALLINT,
    medium_num MEDIUMINT,
    normal_num INT,
    big_num BIGINT,
    decimal_num DECIMAL(10,2),
    float_num FLOAT,
    double_num DOUBLE
);

-- Equivalent PostgreSQL table definition
CREATE TABLE postgres_numeric (
    id SMALLSERIAL PRIMARY KEY,  -- auto-incrementing small integer
    small_num SMALLINT,
    medium_num INTEGER,          -- MEDIUMINT maps to INTEGER
    normal_num INTEGER,
    big_num BIGINT,
    decimal_num NUMERIC(10,2),   -- DECIMAL becomes NUMERIC
    float_num REAL,              -- FLOAT becomes REAL
    double_num DOUBLE PRECISION  -- DOUBLE becomes DOUBLE PRECISION
);
</code></pre></div></div>

<h4 id="字符串类型">String types</h4>

<h5 id="字符串类型对比">String type comparison</h5>

<table>
  <thead>
    <tr>
      <th>MySQL type</th>
      <th>PostgreSQL type</th>
      <th>Description</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CHAR(n)</code></td>
      <td><code class="language-plaintext highlighter-rouge">CHAR(n)</code></td>
      <td>Fixed-length string</td>
      <td><code class="language-plaintext highlighter-rouge">CHAR(10)</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">VARCHAR(n)</code></td>
      <td><code class="language-plaintext highlighter-rouge">VARCHAR(n)</code></td>
      <td>Variable-length string</td>
      <td><code class="language-plaintext highlighter-rouge">VARCHAR(255)</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td>Text</td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TINYTEXT</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td>Short text</td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">MEDIUMTEXT</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td>Medium text</td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">LONGTEXT</code></td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
      <td>Long text</td>
      <td><code class="language-plaintext highlighter-rouge">TEXT</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ENUM</code></td>
      <td><code class="language-plaintext highlighter-rouge">ENUM</code> (via <code class="language-plaintext highlighter-rouge">CREATE TYPE</code>) or <code class="language-plaintext highlighter-rouge">CHECK</code></td>
      <td>Enumerated type</td>
      <td><code class="language-plaintext highlighter-rouge">ENUM('a','b','c')</code></td>
    </tr>
  </tbody>
</table>

<h5 id="字符串类型示例">String type examples</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL string table
CREATE TABLE mysql_strings (
    id INT AUTO_INCREMENT PRIMARY KEY,
    fixed_char CHAR(10),
    variable_char VARCHAR(255),
    long_text LONGTEXT,
    status ENUM('active', 'inactive', 'pending')
);

-- Equivalent PostgreSQL table definition
CREATE TYPE user_status AS ENUM ('active', 'inactive', 'pending');

CREATE TABLE postgres_strings (
    id SERIAL PRIMARY KEY,
    fixed_char CHAR(10),
    variable_char VARCHAR(255),
    long_text TEXT,                    -- all MySQL text types map to TEXT
    status user_status                 -- uses the custom enum type
);

-- Alternatively, use a CHECK constraint
CREATE TABLE postgres_strings_check (
    id SERIAL PRIMARY KEY,
    fixed_char CHAR(10),
    variable_char VARCHAR(255),
    long_text TEXT,
    status VARCHAR(20) CHECK (status IN ('active', 'inactive', 'pending'))
);
</code></pre></div></div>

<h4 id="日期时间类型">Date and time types</h4>

<h5 id="日期时间类型对比">Date and time type comparison</h5>

<table>
  <thead>
    <tr>
      <th>MySQL type</th>
      <th>PostgreSQL type</th>
      <th>Description</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DATE</code></td>
      <td><code class="language-plaintext highlighter-rouge">DATE</code></td>
      <td>Date</td>
      <td><code class="language-plaintext highlighter-rouge">DATE</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TIME</code></td>
      <td><code class="language-plaintext highlighter-rouge">TIME</code></td>
      <td>Time</td>
      <td><code class="language-plaintext highlighter-rouge">TIME</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DATETIME</code></td>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMP</code></td>
      <td>Date and time (no time zone)</td>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMP</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMP</code></td>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMPTZ</code></td>
      <td>Timestamp with time zone</td>
      <td><code class="language-plaintext highlighter-rouge">TIMESTAMPTZ</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">YEAR</code></td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
      <td>Year</td>
      <td><code class="language-plaintext highlighter-rouge">SMALLINT</code></td>
    </tr>
  </tbody>
</table>

<h5 id="日期时间类型示例">Date and time type examples</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL date/time table
CREATE TABLE mysql_datetime (
    id INT AUTO_INCREMENT PRIMARY KEY,
    birth_date DATE,
    work_time TIME,
    created_at DATETIME,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    birth_year YEAR
);

-- Equivalent PostgreSQL table definition
CREATE TABLE postgres_datetime (
    id SERIAL PRIMARY KEY,
    birth_date DATE,
    work_time TIME,
    created_at TIMESTAMP,                    -- DATETIME becomes TIMESTAMP
    updated_at TIMESTAMPTZ DEFAULT NOW(),   -- timestamp with time zone
    birth_year SMALLINT                     -- YEAR becomes SMALLINT
);

-- PostgreSQL has no ON UPDATE CURRENT_TIMESTAMP, so create a trigger that refreshes updated_at
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ language 'plpgsql';

CREATE TRIGGER update_postgres_datetime_updated_at 
    BEFORE UPDATE ON postgres_datetime 
    FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
</code></pre></div></div>

<h4 id="json-类型">JSON types</h4>

<h5 id="json-类型对比">JSON type comparison</h5>

<table>
  <thead>
    <tr>
      <th>MySQL type</th>
      <th>PostgreSQL type</th>
      <th>Description</th>
      <th>Advantage</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">JSON</code></td>
      <td><code class="language-plaintext highlighter-rouge">JSONB</code></td>
      <td>Binary JSON</td>
      <td>Supports GIN indexing and faster queries</td>
    </tr>
  </tbody>
</table>

<h5 id="json-类型示例">JSON type examples</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- MySQL JSON table
CREATE TABLE mysql_json (
    id INT AUTO_INCREMENT PRIMARY KEY,
    user_data JSON,
    settings JSON
);

-- PostgreSQL JSONB table
CREATE TABLE postgres_jsonb (
    id SERIAL PRIMARY KEY,
    user_data JSONB,    -- use JSONB rather than JSON
    settings JSONB
);

-- Create GIN indexes to support JSONB queries
CREATE INDEX idx_user_data_gin ON postgres_jsonb USING gin (user_data);
CREATE INDEX idx_settings_gin ON postgres_jsonb USING gin (settings);

-- JSON query examples
-- MySQL query
SELECT * FROM mysql_json WHERE JSON_EXTRACT(user_data, '$.name') = 'John';

-- PostgreSQL queries
-- (the default GIN operator class accelerates @&gt; and ?, but not -&gt;&gt; comparisons)
SELECT * FROM postgres_jsonb WHERE user_data-&gt;&gt;'name' = 'John';
SELECT * FROM postgres_jsonb WHERE user_data @&gt; '{"status": "active"}';
SELECT * FROM postgres_jsonb WHERE user_data ? 'email';
</code></pre></div></div>

<h4 id="数组类型">Array types</h4>

<h5 id="postgresql-独有的数组类型">Array types specific to PostgreSQL</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- PostgreSQL supports array columns (MySQL has no native array type)
CREATE TABLE postgres_arrays (
    id SERIAL PRIMARY KEY,
    tags TEXT[],                    -- text array
    scores INTEGER[],               -- integer array
    coordinates FLOAT[][],          -- two-dimensional float array
    metadata JSONB[]                -- JSONB array
);

-- Insert array data
INSERT INTO postgres_arrays (tags, scores, coordinates, metadata) VALUES (
    ARRAY['tag1', 'tag2', 'tag3'],
    ARRAY[85, 92, 78],
    ARRAY[[1.0, 2.0], [3.0, 4.0]],
    ARRAY['{"key": "value1"}', '{"key": "value2"}']::jsonb[]  -- cast needed: the bare literal array is text[]
);

-- Array queries
SELECT * FROM postgres_arrays WHERE 'tag1' = ANY(tags);
SELECT * FROM postgres_arrays WHERE array_length(scores, 1) &gt; 2;
SELECT * FROM postgres_arrays WHERE tags @&gt; ARRAY['tag1'];

-- Create a GIN index on the array column
CREATE INDEX idx_tags_gin ON postgres_arrays USING gin (tags);
</code></pre></div></div>

<h4 id="范围类型">Range types</h4>

<h5 id="postgresql-独有的范围类型">Range types specific to PostgreSQL</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- PostgreSQL supports range types (MySQL does not)
CREATE TABLE postgres_ranges (
    id SERIAL PRIMARY KEY,
    price_range NUMRANGE,           -- numeric range
    date_range DATERANGE,           -- date range
    time_range TSRANGE,             -- timestamp range
    int_range INT4RANGE             -- integer range (the type is INT4RANGE; INTRANGE does not exist)
);

-- Insert range data
INSERT INTO postgres_ranges (price_range, date_range, time_range, int_range) VALUES (
    '[100, 500)',                   -- 100 up to but not including 500
    '[2023-01-01, 2023-12-31]',    -- all of 2023
    '[2023-01-01 00:00:00, 2023-01-01 23:59:59]',
    '[1, 10]'                       -- 1 through 10
);

-- Range queries
SELECT * FROM postgres_ranges WHERE price_range @&gt; 250::numeric;  -- contains 250 (element type must match the range)
SELECT * FROM postgres_ranges WHERE date_range &amp;&amp; '[2023-06-01, 2023-06-30]';  -- overlaps
SELECT * FROM postgres_ranges WHERE price_range &lt;@ '[0, 1000]';  -- is contained by

-- Create a GiST index on the range column
CREATE INDEX idx_price_range ON postgres_ranges USING gist (price_range);
</code></pre></div></div>

<hr />

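<h3 id="性能优化关键点">Key performance tuning points</h3>

<h4 id="1-内存配置优化">1. Memory configuration</h4>

<h5 id="work_mem-配置详解">work_mem in detail</h5>

<p><strong>work_mem sizing formula:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Rule-of-thumb upper bound (work_mem applies per sort/hash operation, so one query can use it several times)
work_mem = (total RAM * 0.8 - shared_buffers) / active connections

-- Example: 16GB RAM, 4GB shared_buffers, 100 connections
work_mem = (16GB * 0.8 - 4GB) / 100 ≈ 90MB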
</code></pre></div></div>
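
<p>The arithmetic in the formula can be checked directly in SQL. This sketch assumes 16 GB of RAM, 4 GB of shared_buffers and 100 active connections, as in the example; the 64MB value in the second half is an illustrative assumption.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- (16 GB * 0.8 - 4 GB) / 100 connections, computed in bytes
SELECT pg_size_pretty(
    ((16 * 0.8 - 4) * 1024 * 1024 * 1024 / 100)::bigint
) as work_mem_upper_bound;

-- Because work_mem applies per sort or hash node, a safer pattern is to keep the
-- global value low and raise it only inside a transaction that needs it
BEGIN;
SET LOCAL work_mem = '64MB';
-- run the large sort or aggregate here
COMMIT;
</code></pre></div></div>

<p>The first query works out to roughly 90 MB. Treat that as a ceiling, not a target: a query with several sort or hash steps can use a multiple of work_mem.</p>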

<p><strong>Operations affected by work_mem:</strong></p>

<ul>
  <li>Sorts (ORDER BY)</li>
  <li>Hash joins (Hash Join)</li>
  <li>Hash aggregation (Hash Aggregate)</li>
  <li>Bitmap operations</li>
</ul>

<p><strong>Monitoring work_mem usage:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Temporary file usage (operations that spilled to disk because they exceeded work_mem)
SELECT 
    datname,
    temp_files,
    temp_bytes,
    pg_size_pretty(temp_bytes) as temp_size
FROM pg_stat_database 
WHERE temp_files &gt; 0
ORDER BY temp_bytes DESC;

-- Find sessions whose current query is likely to sort or hash
SELECT 
    pid,
    usename,
    application_name,
    query,
    state
FROM pg_stat_activity 
WHERE query LIKE '%ORDER BY%' 
   OR query LIKE '%GROUP BY%'
   OR query LIKE '%DISTINCT%';
</code></pre></div></div>

<h5 id="关键参数对比">Key parameter comparison</h5>

<table>
  <thead>
    <tr>
      <th>Parameter</th>
      <th>MySQL equivalent</th>
      <th>Suggested PostgreSQL value</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>work_mem</td>
      <td>sort_buffer_size</td>
      <td>64MB-256MB</td>
      <td>Memory per sort/hash operation</td>
    </tr>
    <tr>
      <td>shared_buffers</td>
      <td>innodb_buffer_pool_size</td>
      <td>25% of total RAM</td>
      <td>Shared buffer cache</td>
    </tr>
    <tr>
      <td>effective_cache_size</td>
      <td>-</td>
      <td>75% of total RAM</td>
      <td>Hint for the query planner</td>
    </tr>
    <tr>
      <td>maintenance_work_mem</td>
      <td>-</td>
      <td>256MB-1GB</td>
      <td>Memory for maintenance operations</td>
    </tr>
    <tr>
      <td>temp_buffers</td>
      <td>-</td>
      <td>8MB</td>
      <td>Temporary table buffers</td>
    </tr>
  </tbody>
</table>

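<h5 id="内存配置示例">Memory configuration examples</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Suggested settings by system size (work_mem is kept conservative, well below the formula's upper bound)

-- Small system (4GB RAM)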
shared_buffers = 1GB
effective_cache_size = 3GB
work_mem = 4MB
maintenance_work_mem = 64MB
temp_buffers = 8MB

-- Medium system (16GB RAM)
shared_buffers = 4GB
effective_cache_size = 12GB
work_mem = 16MB
maintenance_work_mem = 256MB
temp_buffers = 8MB

-- Large system (64GB RAM)
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 64MB
maintenance_work_mem = 1GB
temp_buffers = 8MB
</code></pre></div></div>

<h4 id="2-连接管理优化">2. Connection management</h4>

<h5 id="连接池配置">Connection pooling</h5>

<p><strong>Example pgbouncer configuration:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/pgbouncer/pgbouncer.ini
[databases]
app_db = host=127.0.0.1 port=5432 dbname=app_database pool_size=100

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 100
reserve_pool_size = 10
reserve_pool_timeout = 5
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
</code></pre></div></div>

<p><strong>Pool mode comparison:</strong></p>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Connection reuse</th>
      <th>Isolation scope</th>
      <th>Use case</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Session</td>
      <td>Low</td>
      <td>Full session</td>
      <td>Applications that rely on session state</td>
    </tr>
    <tr>
      <td>Transaction</td>
      <td>High</td>
      <td>Per transaction</td>
      <td>Stateless applications</td>
    </tr>
    <tr>
      <td>Statement</td>
      <td>Highest</td>
      <td>Per statement</td>
      <td>Applications that run simple single-statement queries</td>
    </tr>
  </tbody>
</table>

<h5 id="连接监控">Connection monitoring</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Current non-idle connections
SELECT 
    datname,
    usename,
    application_name,
    client_addr,
    state,
    query_start,
    now() - query_start as duration,
    query
FROM pg_stat_activity 
WHERE state != 'idle'
ORDER BY query_start;

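-- Connection usage per database
-- (max_connections is a server setting, not a column, so read it with current_setting)
SELECT 
    datname,
    numbackends as current_connections,
    current_setting('max_connections')::int as max_connections,
    round(numbackends * 100.0 / current_setting('max_connections')::int, 2) as connection_usage
FROM pg_stat_database 
WHERE datname IS NOT NULL;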
</code></pre></div></div>

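<h4 id="2-查询优化策略">3. Query optimization strategies</h4>

<p><strong>CTE vs subquery performance:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- CTE form: before PostgreSQL 12 a CTE was always materialized (an optimization fence), which could be slower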
WITH user_stats AS (
    SELECT user_id, COUNT(*) as order_count
    FROM orders GROUP BY user_id
)
SELECT u.name, us.order_count
FROM users u JOIN user_stats us ON u.id = us.user_id;

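-- Subquery form: usually faster on PostgreSQL 11 and earlier; on 12+ the CTE above is inlined, so both typically get the same plan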
SELECT u.name, us.order_count
FROM users u JOIN (
    SELECT user_id, COUNT(*) as order_count
    FROM orders GROUP BY user_id
) us ON u.id = us.user_id;
</code></pre></div></div>
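
<p>On PostgreSQL 12 and later, whether a CTE is materialized can be stated explicitly. The sketch below reuses the <code class="language-plaintext highlighter-rouge">users</code> and <code class="language-plaintext highlighter-rouge">orders</code> tables from the example above and shows both keywords.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Ask the planner to inline the CTE into the outer query
-- (already the default for a side-effect-free CTE that is referenced once)
WITH user_stats AS NOT MATERIALIZED (
    SELECT user_id, COUNT(*) as order_count
    FROM orders GROUP BY user_id
)
SELECT u.name, us.order_count
FROM users u JOIN user_stats us ON u.id = us.user_id;

-- Force the pre-12 behaviour: compute the CTE once and treat it as an optimization fence
WITH user_stats AS MATERIALIZED (
    SELECT user_id, COUNT(*) as order_count
    FROM orders GROUP BY user_id
)
SELECT u.name, us.order_count
FROM users u JOIN user_stats us ON u.id = us.user_id;
</code></pre></div></div>

<p>Comparing the two with <code class="language-plaintext highlighter-rouge">EXPLAIN</code> shows whether materialization changes the plan for a given workload.</p>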

<p><strong>Indexing differences:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- PostgreSQL does not automatically index foreign key columns
-- Create those indexes yourself
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

-- Column order matters in a composite index
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
-- Serves filters on (status) and on (status, created_at), but not efficiently on created_at alone
</code></pre></div></div>
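
<p>Whether a query can use the composite index is easiest to confirm with <code class="language-plaintext highlighter-rouge">EXPLAIN</code>. The sketch below assumes the <code class="language-plaintext highlighter-rouge">orders</code> table and index from the example above; the filter values are illustrative, and the actual plan depends on table size and statistics.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Leading column and second column are both constrained:
-- idx_orders_status_created is a candidate for this query
EXPLAIN
SELECT * FROM orders
WHERE status = 'active'
  AND created_at &gt;= now() - interval '7 days';

-- Only the second column is constrained: the leading column is missing,
-- so the planner will often choose a sequential scan or another index
EXPLAIN
SELECT * FROM orders
WHERE created_at &gt;= now() - interval '7 days';
</code></pre></div></div>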

<h4 id="3-连接管理">4. Connection management</h4>

<p><strong>Connection pool configuration:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># pgbouncer configuration
[databases]
mydb = host=127.0.0.1 port=5432 pool_size=100

[pgbouncer]
pool_mode = transaction  # transaction-level pooling
max_client_conn = 1000
</code></pre></div></div>

<hr />

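<h3 id="常见陷阱与解决方案">Common pitfalls and solutions</h3>

<h4 id="1-函数和存储过程滥用">1. Overusing functions and stored procedures</h4>

<p><strong>Problem:</strong> Too much business logic is pushed into database functions.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Avoid: complex, deeply nested functions
CREATE OR REPLACE FUNCTION complex_business_logic()
RETURNS TABLE(...) AS $$
BEGIN
    -- Heavy in-memory processing and recursive calls
    -- hurt database performance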
END;
$$ LANGUAGE plpgsql;
</code></pre></div></div>

<p><strong>Solutions:</strong></p>

<ul>
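  <li>Keep functions simple, and mark them <code class="language-plaintext highlighter-rouge">IMMUTABLE</code> or <code class="language-plaintext highlighter-rouge">STABLE</code> when their behavior actually qualifies</li>
  <li>Move complex logic back to the application layer</li>
  <li>Limit the number of triggers</li>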
</ul>

<h4 id="2-触发器性能问题">2. Trigger performance</h4>

<p><strong>Best practice:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- At most one BEFORE and one AFTER trigger per table
CREATE OR REPLACE FUNCTION before_orders()
RETURNS TRIGGER AS $$
BEGIN
    -- Keep all the logic in a single function
    IF TG_OP = 'INSERT' THEN
        -- insert logic
    ELSIF TG_OP = 'UPDATE' THEN
        -- update logic
    END IF;
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
</code></pre></div></div>
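
<p>The function above still has to be attached to the table. A minimal sketch, assuming an existing <code class="language-plaintext highlighter-rouge">orders</code> table and PostgreSQL 11 or later; the trigger name <code class="language-plaintext highlighter-rouge">trg_before_orders</code> is hypothetical.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- One BEFORE trigger handles both INSERT and UPDATE through the shared function
CREATE TRIGGER trg_before_orders
    BEFORE INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION before_orders();

-- List the user-defined triggers on the table to confirm there is only one per timing
SELECT tgname, tgenabled
FROM pg_trigger
WHERE tgrelid = 'orders'::regclass
  AND NOT tgisinternal;
</code></pre></div></div>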

<h4 id="3-notify-机制限制">3. NOTIFY limitations</h4>

<p><strong>Problem:</strong> A high volume of NOTIFY events hurts performance.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Alternative: an event queue table
CREATE TABLE event_queue (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id uuid NOT NULL,
    type text NOT NULL,
    data jsonb NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    acquired_at timestamptz
);

-- Claim events in batches; SKIP LOCKED lets concurrent workers take different rows
UPDATE event_queue 
SET acquired_at = now() 
WHERE id IN (
    SELECT id FROM event_queue 
    WHERE acquired_at IS NULL 
    ORDER BY created_at 
    LIMIT 1000
    FOR UPDATE SKIP LOCKED 
) RETURNING *;
</code></pre></div></div>
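
<p>Polling the queue stays cheap if pending rows can be found through a partial index, and the table stays small if claimed rows are purged. The sketch below builds on the <code class="language-plaintext highlighter-rouge">event_queue</code> table above; the index name and the one-day retention period are assumptions for illustration.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Index only the rows that have not been claimed yet, in processing order
CREATE INDEX idx_event_queue_pending
    ON event_queue (created_at)
    WHERE acquired_at IS NULL;

-- Purge events that were claimed more than a day ago
DELETE FROM event_queue
WHERE acquired_at &lt; now() - interval '1 day';
</code></pre></div></div>

<p>Because this table sees constant inserts, updates and deletes, it is also a good candidate for the more aggressive per-table autovacuum settings.</p>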

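<h4 id="4-null-值处理差异">4. NULL handling differences</h4>

<p><strong>Problem:</strong> <code class="language-plaintext highlighter-rouge">IS NOT DISTINCT FROM</code> cannot use an index.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Avoid: cannot use the index on email
SELECT * FROM users WHERE email IS NOT DISTINCT FROM 'test@example.com';

-- Recommended: an explicit NULL check. With a non-NULL constant the first branch is
-- always false and plain equality is enough; keep it when the value is a parameter that may be NULL
SELECT * FROM users 
WHERE (email IS NULL AND 'test@example.com' IS NULL) 
   OR email = 'test@example.com';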
</code></pre></div></div>

<hr />

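<h3 id="扩展生态系统">Extension ecosystem</h3>

<h4 id="核心扩展推荐">Recommended core extensions</h4>

<h5 id="1-性能监控扩展">1. Performance monitoring</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Install the key monitoring extensions
-- (pg_stat_statements ships with contrib; pg_qualstats and pg_wait_sampling are
--  third-party packages; all three must be listed in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pg_qualstats;
CREATE EXTENSION IF NOT EXISTS pg_wait_sampling;

-- Slowest queries by total execution time
-- (PostgreSQL 13+ column names; older versions use total_time and mean_time)
SELECT query, total_exec_time, calls, mean_exec_time
FROM pg_stat_statements 
ORDER BY total_exec_time DESC 
LIMIT 10;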
</code></pre></div></div>
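
<p>pg_stat_statements only collects data once its library is preloaded. The steps below are a sketch of that setup; note that <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> replaces the whole list, so any libraries already configured need to be included in the value.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Preload the library (takes effect only after a server restart)
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';

-- After the restart, confirm the setting
SHOW shared_preload_libraries;

-- Clear the collected counters before a measurement run
SELECT pg_stat_statements_reset();
</code></pre></div></div>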

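<h5 id="2-数据模型扩展">2. Data model extensions</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- JSONB document storage
CREATE TABLE products (
    id serial PRIMARY KEY,
    metadata jsonb,
    created_at timestamptz DEFAULT now()
);

-- Create a GIN index to support containment queries
CREATE INDEX idx_products_metadata ON products USING gin (metadata);

-- Query example: @&gt; tests containment only; MongoDB-style operators such as $gt
-- are not interpreted, so the range condition needs its own expression
SELECT * FROM products 
WHERE metadata @&gt; '{"category": "electronics"}'
  AND (metadata-&gt;&gt;'price')::numeric &gt; 500;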
</code></pre></div></div>

<h5 id="3-时序数据扩展">3. Time-series extensions</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- TimescaleDB hypertable (partitioned into time chunks automatically)
SELECT create_hypertable('sensor_data', 'timestamp');

-- Enable compression on the hypertable
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_orderby = 'timestamp DESC'
);
</code></pre></div></div>

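<h4 id="外部数据包装器-fdw">Foreign data wrappers (FDW)</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Connect to another PostgreSQL instance
CREATE EXTENSION postgres_fdw;

CREATE SERVER remote_server 
FOREIGN DATA WRAPPER postgres_fdw 
OPTIONS (host 'remote-host', port '5432', dbname 'remote_db');

-- Map the local role to remote credentials (placeholder values)
CREATE USER MAPPING FOR CURRENT_USER
SERVER remote_server
OPTIONS (user 'remote_user', password 'remote_password');

-- Expose the remote table locally (assumes it lives in the remote public schema)
IMPORT FOREIGN SCHEMA public LIMIT TO (remote_orders)
FROM SERVER remote_server INTO public;

-- Federated query
SELECT l.name, r.amount 
FROM local_customers l 
JOIN remote_orders r ON l.id = r.customer_id;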
</code></pre></div></div>

<hr />

<h3 id="监控与诊断">Monitoring and diagnostics</h3>

<h4 id="1-关键监控指标">1. Key metrics</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Database health check
SELECT 
    datname,
    numbackends as connections,
    xact_commit + xact_rollback as transactions,
    blks_read + blks_hit as total_blocks,
    -- nullif avoids a division-by-zero error on a database with no block reads yet
    round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 2) as cache_hit_ratio
FROM pg_stat_database 
WHERE datname = current_database();
</code></pre></div></div>

<h4 id="2-表膨胀监控">2. Table bloat monitoring</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Monitor dead tuples as a proxy for table bloat
-- (pg_stat_user_tables exposes the table name as relname and its OID as relid)
SELECT 
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) as size,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / (n_live_tup + n_dead_tup), 2) as dead_ratio
FROM pg_stat_user_tables 
WHERE n_dead_tup &gt; 0
ORDER BY dead_ratio DESC;
</code></pre></div></div>

<h4 id="3-索引使用情况">3. Index usage</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Find indexes that have never been scanned since statistics were last reset
SELECT 
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes 
WHERE idx_scan = 0;
</code></pre></div></div>

<hr />

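<h3 id="迁移策略建议">Migration strategy recommendations</h3>

<h4 id="1-分阶段迁移计划">1. Phased migration plan</h4>

<p><strong>Phase 1: infrastructure preparation</strong></p>

<ul>
  <li>Set up the PostgreSQL cluster</li>
  <li>Configure monitoring and backups</li>
  <li>Build development and test environments</li>
</ul>

<p><strong>Phase 2: data migration</strong></p>

<ul>
  <li>Use <code class="language-plaintext highlighter-rouge">pgloader</code> or custom scripts</li>
  <li>Verify data integrity</li>
  <li>Run performance benchmarks</li>
</ul>

<p><strong>Phase 3: application adaptation</strong></p>

<ul>
  <li>Adjust SQL query syntax</li>
  <li>Tune the connection pool configuration</li>
  <li>Update monitoring metrics</li>
</ul>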
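
<p>The integrity check in phase 2 can be sketched as a per-table fingerprint that is computed on both sides and compared. This is a minimal sketch for the PostgreSQL side only; <code class="language-plaintext highlighter-rouge">orders</code> and its <code class="language-plaintext highlighter-rouge">id</code> column are hypothetical names, and the MySQL side would need an equivalent query that produces the same values.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Row count, key range, and a hash over all primary keys (hypothetical table "orders")
SELECT
    count(*)                                   AS row_count,
    min(id)                                    AS min_id,
    max(id)                                    AS max_id,
    md5(string_agg(id::text, ',' ORDER BY id)) AS id_fingerprint
FROM orders;
</code></pre></div></div>

<p>Matching counts and fingerprints do not prove that every column was copied correctly, but a mismatch is a cheap early signal that rows were lost or duplicated.</p>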

<h4 id="2-关键迁移工具">2. Key migration tools</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Migrate with pgloader
pgloader mysql://user:pass@mysql-host/dbname \
         postgresql://user:pass@pg-host/dbname

# Migrate with ora2pg (built for Oracle, also usable for MySQL)
ora2pg -c config/ora2pg.conf
</code></pre></div></div>

<h4 id="3-性能验证">3. Performance validation</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Create a test database
CREATE DATABASE test_migration;

-- Run a benchmark query (\timing is a psql meta-command)
\timing on
EXPLAIN ANALYZE SELECT * FROM large_table WHERE indexed_column = 'value';

-- Compare the metrics before and after the migration
</code></pre></div></div>

<hr />

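<h3 id="总结与建议">Summary and recommendations</h3>

<h4 id="核心要点">Key points</h4>

<ol>
  <li><strong>MVCC differences</strong>: PostgreSQL's append-style MVCC needs closer monitoring and management</li>
  <li><strong>Extension ecosystem</strong>: make full use of PostgreSQL extensions to avoid a multi-database architecture</li>
  <li><strong>Performance tuning</strong>: focus on <code class="language-plaintext highlighter-rouge">work_mem</code>, <code class="language-plaintext highlighter-rouge">shared_buffers</code>, and the autovacuum settings</li>
  <li><strong>Monitoring first</strong>: build thorough monitoring, especially for table bloat and index usage</li>
</ol>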
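
<p>To illustrate the tuning point above, autovacuum thresholds can be tightened per table instead of globally. The table name and the values below are assumptions for a large, frequently updated table, not recommended defaults.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Vacuum after roughly 2% of rows are dead (the global default scale factor is 0.2)
ALTER TABLE your_table SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_analyze_scale_factor = 0.01
);

-- Raise work_mem for the current session only, e.g. before one heavy report query
SET work_mem = '64MB';

-- Check when autovacuum last processed the table
SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'your_table';
</code></pre></div></div>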

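<h4 id="迁移检查清单">Migration checklist</h4>

<ul>
  <li>Configure suitable <code class="language-plaintext highlighter-rouge">work_mem</code> and <code class="language-plaintext highlighter-rouge">shared_buffers</code> values</li>
  <li>Set the autovacuum parameters</li>
  <li>Create indexes for foreign keys</li>
  <li>Install the key monitoring extensions</li>
  <li>Set up table bloat monitoring</li>
  <li>Configure a connection pool</li>
  <li>Define the backup and recovery strategy</li>
  <li>Establish performance benchmarks</li>
</ul>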
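
<p>The "indexes for foreign keys" item matters because PostgreSQL, unlike MySQL's InnoDB, does not create an index on the referencing columns automatically. One possible way to list foreign keys that have no supporting index is the catalog query below. It is a sketch, not an exhaustive check: it only looks for an index whose leading columns cover the foreign key columns.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Foreign key constraints without an index whose leading columns cover the key
SELECT
    c.conrelid::regclass AS table_name,
    c.conname            AS constraint_name
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index i
      WHERE i.indrelid = c.conrelid
        -- indkey is 0-based: take as many leading index columns as the key has
        AND (i.indkey::int2[])[0:cardinality(c.conkey) - 1] @&gt; c.conkey
  );
</code></pre></div></div>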

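<h4 id="长期维护建议">Long-term maintenance recommendations</h4>

<ol>
  <li><strong>Regular monitoring</strong>: check table bloat and index usage every week</li>
  <li><strong>Performance tuning</strong>: adjust parameters to match the actual workload</li>
  <li><strong>Extension review</strong>: periodically evaluate new extensions and features</li>
  <li><strong>Team training</strong>: make sure the team understands PostgreSQL-specific concepts and best practices</li>
</ol>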

<hr />

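<h3 id="故障排除指南">Troubleshooting guide</h3>

<h4 id="常见问题与解决方案">Common problems and solutions</h4>

<h5 id="1-连接问题">1. Connection problems</h5>

<p><strong>Problem: cannot connect to PostgreSQL</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Check the service status
sudo systemctl status postgresql

# Check that the port is listening
sudo netstat -tlnp | grep 5432

# Check the configuration file
sudo nano /etc/postgresql/15/main/postgresql.conf
# Make sure listen_addresses = '*' (or lists the interfaces that clients connect on)

# Check the authentication configuration
sudo nano /etc/postgresql/15/main/pg_hba.conf
# Make sure a rule matches the client address, database, and user

# Restart the service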
sudo systemctl restart postgresql
</code></pre></div></div>

<p><strong>Problem: authentication fails</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Check that the user exists
SELECT usename FROM pg_user WHERE usename = 'your_username';

-- Reset the password
ALTER USER your_username WITH PASSWORD 'new_password';

-- Check the user's privileges (psql meta-command)
\du your_username
</code></pre></div></div>

<h5 id="2-性能问题">2. Performance problems</h5>

<p><strong>Problem: slow queries</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Enable query statistics
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Slowest queries by mean execution time
-- (PostgreSQL 13+ column names; older versions use total_time / mean_time)
SELECT 
    query,
    calls,
    total_exec_time,
    mean_exec_time,
    rows,
    100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements 
ORDER BY mean_exec_time DESC 
LIMIT 10;

-- Analyze the query plan
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM your_table WHERE condition;
</code></pre></div></div>

<p><strong>Problem: severe table bloat</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Check dead tuples per table
SELECT 
    schemaname,
    relname,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / (n_live_tup + n_dead_tup), 2) as dead_ratio
FROM pg_stat_user_tables 
WHERE n_dead_tup &gt; 0
ORDER BY dead_ratio DESC;

-- Run vacuum manually
VACUUM ANALYZE your_table;

-- If the bloat is severe, use pg_repack (a separate command-line tool)
-- pg_repack -d your_database -t your_table
</code></pre></div></div>
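
<p>If dead tuples stay high even after a manual <code class="language-plaintext highlighter-rouge">VACUUM</code>, a long-running transaction may be preventing cleanup, because vacuum cannot remove row versions that an open transaction might still need. A minimal sketch for spotting such sessions follows; the one-hour threshold is an assumption.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Sessions whose transaction has been open for more than an hour
SELECT
    pid,
    usename,
    state,
    xact_start,
    now() - xact_start AS xact_age,
    left(query, 80)    AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start &gt; interval '1 hour'
ORDER BY xact_start;
</code></pre></div></div>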

<h5 id="3-锁问题">3. Lock problems</h5>

<p><strong>Problem: a query is blocked</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Show blocked sessions and the sessions blocking them
SELECT 
    blocked_locks.pid AS blocked_pid,
    blocked_activity.usename AS blocked_user,
    blocking_locks.pid AS blocking_pid,
    blocking_activity.usename AS blocking_user,
    blocked_activity.query AS blocked_statement,
    blocking_activity.query AS current_statement_in_blocking_process
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

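-- Terminate the blocking session (replace 12345 with a blocking_pid from the query above);
-- pg_cancel_backend(pid) cancels only the running query and is the gentler first step
SELECT pg_terminate_backend(12345);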
</code></pre></div></div>
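
<p>On PostgreSQL 9.6 and later, the same information can usually be obtained more briefly with the built-in <code class="language-plaintext highlighter-rouge">pg_blocking_pids()</code> function. A minimal sketch:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Each blocked session together with the PIDs that block it
SELECT
    pid,
    pg_blocking_pids(pid) AS blocked_by,
    wait_event_type,
    left(query, 80)       AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) &gt; 0;
</code></pre></div></div>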

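<h5 id="4-磁盘空间问题">4. Disk space problems</h5>

<p><strong>Problem: running out of disk space</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Check database sizes
SELECT 
    datname,
    pg_size_pretty(pg_database_size(datname)) as size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Check table sizes
SELECT 
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) as size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- WAL cleanup (proceed with caution; never delete files from pg_wal by hand)
-- First check how much space the WAL directory currently uses
SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir();

-- Check for replication slots that force old WAL to be retained
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;

-- Force a switch to a new WAL segment; this frees nothing by itself,
-- old segments are recycled after a checkpoint once nothing needs them
SELECT pg_switch_wal();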
</code></pre></div></div>

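<h5 id="5-配置问题">5. Configuration problems</h5>

<p><strong>Problem: a parameter is misconfigured</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Show the current settings
SELECT name, setting, unit, context, short_desc 
FROM pg_settings 
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');

-- Change a setting
-- shared_buffers has context = postmaster, so it takes effect only after a restart;
-- pg_reload_conf() applies reloadable settings such as work_mem
ALTER SYSTEM SET shared_buffers = '256MB';
SELECT pg_reload_conf();

-- List settings that are still waiting for a restart
SELECT name, setting, pending_restart FROM pg_settings WHERE pending_restart;

-- Verify the value after restarting the service
SHOW shared_buffers;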
</code></pre></div></div>

<h4 id="监控脚本">Monitoring scripts</h4>

<h5 id="系统健康检查脚本">System health check script</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# postgres_health_check.sh

echo "=== PostgreSQL Health Check ==="
echo "Date: $(date)"
echo

# Check the service status
echo "1. Service Status:"
systemctl is-active postgresql

# Check connection usage (max_connections is a server setting, not a view column)
echo -e "\n2. Connection Status:"
psql -U postgres -c "
SELECT 
    datname,
    numbackends as current_connections,
    current_setting('max_connections')::int as max_connections,
    round(numbackends * 100.0 / current_setting('max_connections')::int, 2) as usage_percent
FROM pg_stat_database 
WHERE datname NOT IN ('template0', 'template1', 'postgres');
"

# Check table bloat
echo -e "\n3. Table Bloat Check:"
psql -U postgres -c "
SELECT 
    schemaname,
    relname,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / (n_live_tup + n_dead_tup), 2) as dead_ratio
FROM pg_stat_user_tables 
WHERE n_dead_tup &gt; 0 AND n_live_tup + n_dead_tup &gt; 1000
ORDER BY dead_ratio DESC
LIMIT 10;
"

# Check slow queries (PostgreSQL 13+ column names)
echo -e "\n4. Slow Queries:"
psql -U postgres -c "
SELECT 
    query,
    calls,
    mean_exec_time,
    total_exec_time
FROM pg_stat_statements 
ORDER BY mean_exec_time DESC 
LIMIT 5;
"

# Check locks
echo -e "\n5. Lock Status:"
psql -U postgres -c "
SELECT 
    mode,
    count(*) as lock_count
FROM pg_locks 
GROUP BY mode
ORDER BY lock_count DESC;
"

echo -e "\n=== Health Check Complete ==="
</code></pre></div></div>

<h5 id="性能监控脚本">Performance monitoring script</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# postgres_performance_monitor.sh

LOG_FILE="/var/log/postgres_performance.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')

echo "[$DATE] Performance Check" &gt;&gt; "$LOG_FILE"

# Check the cache hit ratio (nullif avoids division by zero on an idle server)
CACHE_HIT=$(psql -U postgres -t -c "
SELECT round(100.0 * sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0), 2)
FROM pg_stat_database 
WHERE datname NOT IN ('template0', 'template1', 'postgres');
")

echo "[$DATE] Cache Hit Ratio: $CACHE_HIT%" &gt;&gt; "$LOG_FILE"

# Check active connections
ACTIVE_CONNECTIONS=$(psql -U postgres -t -c "
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';
")

echo "[$DATE] Active Connections: $ACTIVE_CONNECTIONS" &gt;&gt; "$LOG_FILE"

# Check the total database size
DB_SIZE=$(psql -U postgres -t -c "
SELECT pg_size_pretty(sum(pg_database_size(datname)))
FROM pg_database 
WHERE datname NOT IN ('template0', 'template1', 'postgres');
")

echo "[$DATE] Total Database Size: $DB_SIZE" &gt;&gt; "$LOG_FILE"

# Check the size of the WAL directory
WAL_SIZE=$(psql -U postgres -t -c "
SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir();
")

echo "[$DATE] WAL Size: $WAL_SIZE" &gt;&gt; "$LOG_FILE"

echo "[$DATE] Performance Check Complete" &gt;&gt; "$LOG_FILE"
echo "---" &gt;&gt; "$LOG_FILE"
</code></pre></div></div>

<h4 id="紧急恢复程序">Emergency recovery procedures</h4>

<h5 id="数据库恢复">Database recovery</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># 1. Stop the PostgreSQL service
sudo systemctl stop postgresql

# 2. Take a file-level copy of the current data directory
sudo cp -a /var/lib/postgresql/15/main /var/lib/postgresql/15/main.backup.$(date +%Y%m%d_%H%M%S)

# 3. Start the service again (pg_restore needs a running server)
sudo systemctl start postgresql

# 4. Restore from the backup
sudo -u postgres pg_restore -d your_database /path/to/backup.dump

# 5. Verify the data
psql -U postgres -d your_database -c "SELECT count(*) FROM your_table;"
</code></pre></div></div>

<h5 id="配置恢复">Configuration recovery</h5>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Restore the configuration files
sudo cp /etc/postgresql/15/main/postgresql.conf.backup /etc/postgresql/15/main/postgresql.conf
sudo cp /etc/postgresql/15/main/pg_hba.conf.backup /etc/postgresql/15/main/pg_hba.conf

# Restart the service
sudo systemctl restart postgresql
</code></pre></div></div>

<hr />

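<h3 id="总结与最佳实践">Summary and best practices</h3>

<h4 id="迁移检查清单-1">Migration checklist</h4>

<h5 id="迁移前准备">Before the migration</h5>

<ul>
  <li>Assess the structure and data volume of the existing MySQL databases</li>
  <li>Choose a suitable PostgreSQL version</li>
  <li>Prepare a test environment</li>
  <li>Write a rollback plan</li>
  <li>Train the team</li>
</ul>

<h5 id="迁移过程">During the migration</h5>

<ul>
  <li>Install and configure PostgreSQL</li>
  <li>Create users and privileges</li>
  <li>Migrate the table structure</li>
  <li>Migrate the data</li>
  <li>Migrate stored procedures and functions</li>
  <li>Update the application connection configuration</li>
  <li>Run functional tests</li>
  <li>Run performance tests</li>
</ul>

<h5 id="迁移后优化">After the migration</h5>

<ul>
  <li>Configure the monitoring system</li>
  <li>Optimize query performance</li>
  <li>Tune configuration parameters</li>
  <li>Establish a backup strategy</li>
  <li>Define a maintenance plan</li>
</ul>

<h4 id="关键成功因素">Key success factors</h4>

<ol>
  <li><strong>Thorough testing</strong>: run full functional and performance tests before migrating</li>
  <li><strong>Incremental migration</strong>: consider migrating in phases to reduce risk</li>
  <li><strong>Monitoring first</strong>: build thorough monitoring</li>
  <li><strong>Team training</strong>: make sure the team understands PostgreSQL's features and best practices</li>
  <li><strong>Documentation upkeep</strong>: keep configuration and process documentation current</li>
</ol>

<h4 id="长期维护建议-1">Long-term maintenance recommendations</h4>

<ol>
  <li><strong>Regular monitoring</strong>: check system health every week</li>
  <li><strong>Performance tuning</strong>: adjust configuration parameters to match the actual workload</li>
  <li><strong>Version upgrades</strong>: plan PostgreSQL version upgrades</li>
  <li><strong>Extension review</strong>: periodically evaluate new extensions and features</li>
  <li><strong>Security audits</strong>: audit the security configuration regularly</li>
</ol>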

<p>By following this guide, senior backend developers can migrate from MySQL to PostgreSQL more smoothly and make full use of PostgreSQL's capabilities. Keep in mind that migration is an ongoing process that needs continuous monitoring, optimization, and improvement.</p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="postgresql" /><category term="mysql" /><category term="数据库" /><summary type="html"><![CDATA[PostgreSQL迁移指南：从MySQL到PostgreSQL的关键差异与实践 本文为熟悉MySQL但缺乏PostgreSQL经验的高级后端程序员提供全面的迁移参考。重点对比了两大数据库系统的核心架构差异： MVCC机制：MySQL使用增量存储，PostgreSQL采用整行复制，导致不同的表膨胀特性和性能表现 存储架构：PostgreSQL的统一存储引擎与MySQL的多引擎架构对比 数据类型：PostgreSQL提供更丰富的类型系统，包括JSONB、数组、范围类型等高级特性 指南还涵盖了SQL语法差异、性]]></summary></entry><entry><title type="html">[metabase]高级使用技巧1 geojson导入和图表，动态sql执行， 动态过滤，动态分组，动态列</title><link href="https://gaoxingliang.github.io/blog/2025/09/22/metabase-1-geojson-sql-151955230/" rel="alternate" type="text/html" title="[metabase]高级使用技巧1 geojson导入和图表，动态sql执行， 动态过滤，动态分组，动态列" /><published>2025-09-22T02:19:40+00:00</published><updated>2025-09-22T02:19:40+00:00</updated><id>https://gaoxingliang.github.io/blog/2025/09/22/metabase-1-geojson-sql-151955230</id><content type="html" xml:base="https://gaoxingliang.github.io/blog/2025/09/22/metabase-1-geojson-sql-151955230/"><![CDATA[<h2 id="本文">About this post</h2>

<p>This post shares advanced Metabase techniques: importing GeoJSON and charting with it, dynamic SQL execution, dynamic filtering, dynamic grouping, and dynamic columns.</p>

<h2 id="metabase中使用geojson">Using GeoJSON in Metabase</h2>

<p>The goal is to show user counts by region, as in the chart below:<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-001.png" alt="Screenshot" /></p>

<h3 id="操作步骤">Steps</h3>

<h4 id="1加入metabase-geo-json地址">1. Add the GeoJSON URL to Metabase</h4>

<p>Click Admin settings in the top-right corner, then Settings -&gt; Maps -&gt; Add a map:<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-002.png" alt="Screenshot" /><br />
 This step needs a URL, which can be obtained from Alibaba Cloud DataV:<br />
 Visit <a href="https://datav.aliyun.com/portal/school/atlas/area_selector">https://datav.aliyun.com/portal/school/atlas/area_selector</a><br />
 Select the region you need (for example, I only care about Sichuan Province)<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-003.png" alt="Screenshot" /></p>

<p>Copy the GeoJSON URL shown there:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>https://geo.datav.aliyun.com/areas_v3/bound/geojson?code=510000_full
</code></pre></div></div>

<h4 id="2创建相关的dashboard">2. Create the dashboard</h4>

<p>Suppose there is a user table:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>create table user(
id int primary key,
province_id int,
city_id int,
name varchar(255)
);
</code></pre></div></div>

<p>The following query counts users per city within the province:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select city_id, count(*) from user
where province_id = 510000 and city_id is not null
GROUP BY  city_id
</code></pre></div></div>

<p>Choose Visualization -&gt; Map, then set the region map to the one created earlier.<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-004.png" alt="Screenshot" /></p>

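<h2 id="复杂sql执行">Running complex SQL</h2>

<p>Metabase does not natively support executing multiple SQL statements in one query (<a href="https://github.com/metabase/metabase/issues/4050">GitHub discussion</a>), so business logic that is complex enough to need multi-statement SQL requires a manual modification.<br />
 This post describes one way to let Metabase execute multi-statement SQL and use the last result set as the final result.</p>

<p>The approach modifies the MariaDB driver that Metabase uses for MySQL so that it supports multi-statement execution: <a href="https://github.com/gaoxingliang/mariadb-connector-j">github</a></p>

<p>The main code is in <a href="https://github.com/gaoxingliang/mariadb-connector-j/blob/metabase-2.7.10/src/main/java/org/mariadb/jdbc/ProxyedSqlComponent.java">ProxyedSqlComponent.java</a><br />
 Then:<br />
 (1) Replace the relevant driver classes in metabase.jar.<br />
 A prebuilt image is available (used the same way as in https://blog.csdn.net/scugxl/article/details/150004029):</p>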

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker pull edwardg/metabase:v0.54.9.1
</code></pre></div></div>

<p>Example Dockerfile:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FROM metabase/metabase:v0.54.9

# Install the zip tool
USER root
RUN apk update &amp;&amp; apk add --no-cache zip

# Create a temporary working directory
WORKDIR /tmp

COPY mariadb-java-client-2.7.10.jar mariadb-java-client-2.7.10.jar

# Do everything in a single RUN layer
RUN unzip /app/metabase.jar -d metabase-extracted &amp;&amp; \
    unzip /tmp/mariadb-java-client-2.7.10.jar -d mariadb-extracted &amp;&amp; \
    cp -r mariadb-extracted/org/mariadb/jdbc/Driver* metabase-extracted/org/mariadb/jdbc/ &amp;&amp; \
    cp -r mariadb-extracted/org/mariadb/jdbc/ProxyedSqlComponent* metabase-extracted/org/mariadb/jdbc/ &amp;&amp; \
    cd metabase-extracted &amp;&amp; \
    zip -r /app/metabase.jar ./* &amp;&amp; \
    rm -rf /tmp/metabase-extracted /tmp/mariadb-extracted /tmp/mariadb-java-client-2.7.10.jar

# Restore the working directory
WORKDIR /app
</code></pre></div></div>

<p>(2) Add <code class="language-plaintext highlighter-rouge">allowMultiQueries=true</code> to the database configuration<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-005.png" alt="Screenshot" /></p>

<h3 id="复杂sql测试">Testing complex SQL</h3>

<p>Suppose I add one filter that selects the GROUP BY field. Grouping by auth status returns two columns, auth status and count(id); grouping by org_id returns three columns, province_id, city_id, and count(id). <strong>Note that different GROUP BY choices produce different columns, all dynamically.</strong> The SQL looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Group-by field chosen through the Metabase filter
-- (the filter variable reference is missing from this line as published)
SET @group_by_field = ;

-- Build the dynamic SQL (column list, filter, and grouping)
SET @sql = CONCAT(
        'SELECT ',
        CASE @group_by_field
            WHEN 'org_id' THEN '`province_id` AS `省`, `city_id` AS `市`, count(*) AS `count`'
            WHEN 'auth_status' THEN '`auth_status` AS `认证状态`,  count(*) AS `count`'
            END,
        ' FROM `user` ',
    -- Dynamic filter (add WHERE conditions here)
    -- Dynamic grouping
        'GROUP BY ',
        CASE @group_by_field
            WHEN 'org_id' THEN '1,2'
            WHEN 'auth_status' THEN '1'
            END
    );

-- Prepare the statement (turn @sql into an executable statement)
PREPARE stmt FROM @sql;

-- Execute the prepared statement
EXECUTE stmt
</code></pre></div></div>
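
<p>The "dynamic filter" slot in the statement above is left empty. As a hedged sketch of how it might be filled in, the MySQL snippet below appends a <code class="language-plaintext highlighter-rouge">WHERE</code> clause only when a filter value is present. <code class="language-plaintext highlighter-rouge">@province_filter</code> is a hypothetical variable; in Metabase its value would come from a filter, and here it is hard-coded for illustration.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Hypothetical filter value; set it to NULL to skip the filter
SET @province_filter = 510000;

SET @sql = CONCAT(
        'SELECT `city_id`, count(*) AS `count` FROM `user` ',
        -- Dynamic filter: add the WHERE clause only when a value is supplied;
        -- the cast keeps the concatenated value numeric
        IF(@province_filter IS NULL,
           '',
           CONCAT('WHERE `province_id` = ', CAST(@province_filter AS UNSIGNED), ' ')),
        'GROUP BY 1'
    );

PREPARE stmt FROM @sql;
EXECUTE stmt
</code></pre></div></div>

<p>Because the statement text is assembled by string concatenation, any value taken from user input should be cast or validated as above rather than pasted in directly.</p>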

<p>The result looks like this:<br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-006.png" alt="Screenshot" /><br />
 <img src="/assets/images/posts/metabase-1-geojson-sql-151955230/img-007.png" alt="Screenshot" /><br />
 With this technique you can combine arbitrarily complex SQL statistics with filters.</p>

<h2 id="总结">Summary</h2>

<p>This post introduced GeoJSON maps and shared one way to hack multi-statement SQL into Metabase. It is the advanced-techniques part of the <a href="https://blog.csdn.net/scugxl/article/details/150003515?spm=1001.2014.3001.5501">Metabase series</a>. The next post covers how to share dashboards.</p>]]></content><author><name>gaoxingliang</name></author><category term="迁移自CSDN" /><category term="metabase" /><category term="bi" /><category term="可视化" /><category term="数据可视化" /><category term="数据库" /><summary type="html"><![CDATA[本文是metbase的高级技巧分享，主要包括：geojson导入和图表，动态sql执行， 动态过滤，动态分组，动态列]]></summary></entry></feed>