一尘不染

Extract part of a regex match

html

I want a regular expression that extracts the title from an HTML page. Currently I have this:

title = re.search('<title>.*</title>', html, re.IGNORECASE).group()
if title:
    title = title.replace('<title>', '').replace('</title>', '')

Is there a regular expression that extracts only the contents of <title>, so I don't have to remove the tags?

2020-05-10

1 Answer

一尘不染

Use ( ) in the regular expression and group(1) in Python to retrieve the captured string. re.search returns None when no match is found, so do not call group() on its result directly:

title_search = re.search('<title>(.*)</title>', html, re.IGNORECASE)

if title_search:
    title = title_search.group(1)

2020-05-10
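The capture-group approach can be wrapped in a small function, sketched below. The name extract_title is hypothetical, and the non-greedy (.*?) and the re.DOTALL flag are additions not present in the original answer: they are assumed here so that the match stops at the first closing tag and still works when the title spans several lines.

```python
import re


# Hypothetical helper name; not part of the original answer.
def extract_title(html):
    """Return the text inside the first <title> element, or None if absent."""
    # (.*?) is non-greedy, so the match stops at the first </title>.
    # re.DOTALL lets '.' match newlines in case the title is wrapped.
    match = re.search(r'<title>(.*?)</title>', html, re.IGNORECASE | re.DOTALL)
    if match is None:
        # re.search found nothing; calling .group() here would raise.
        return None
    return match.group(1).strip()


print(extract_title('<html><head><TITLE>My Page</TITLE></head></html>'))  # My Page
print(extract_title('<html><body>no title here</body></html>'))           # None
```

This sketch only matches a bare <title> tag without attributes, which is the case the question describes.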