一尘不染

无论如何,我无法在Hibernate中批处理MySQL INSERT语句

mysql

我目前面临着众所周知的常见Hibernate插入批处理问题。

我需要保存500万行的批次。我首先尝试减轻负载。由于我只需要插入2种类型的实体(首先是类型A的所有记录,然后是类型B的所有记录,都指向公共类型C的ManyToOne父级),所以我想从JDBC批处理插入中获得最大的好处。

我已经阅读了许多文档,但是没有一个尝试过。

  • 我知道 为了使用批处理插入,我一定不能使用实体生成器。因此,我删除了AUTO_INCREMENTID,并使用一个技巧设置了ID:SELECT MAX(ID) FROM ENTITIES并且每次都递增。
  • 我知道 我必须定期刷新会话。我将代码提前发布,但是无论如何,我每500个元素执行一次事务。
  • 我知道 我必须设置hibernate.jdbc.batch_size与我的应用程序的批量大小一致,所以我在LocalSessionFactoryBean(Spring ORM集成)中进行了设置
  • 我知道 我必须启用重写连接URL中的批处理语句。

这是我的实体

共同的父实体。这首先插入到单个事务中。我不在 这里 自动递增列。每批作业仅 一个 记录

@Entity
@Table(...)
@SequenceGenerator(...)
public class Deal
{

    @Id
    @Column(
            name = "DEAL_ID",
            nullable = false)
    @GeneratedValue(
            strategy = GenerationType.AUTO)
    protected Long id;

    ................
}

其中一个孩子(假设每批250万条记录)

@Entity
@Table(
        name = "TA_LOANS")
public class Loan
{

    @Id
    @Column(
            name = "LOAN_ID",
            nullable = false)
    protected Long id;

    @ManyToOne(
            optional = false,
            targetEntity = Deal.class,
            fetch = FetchType.LAZY)
    @JoinColumn(
            name = "DEAL_ID",
            nullable = false)
    protected Deal deal;


    .............
}

其他孩子类型。假设其他250万条记录

@Entity
@Table(
        name = "TA_BONDS")
public class Bond
{

    @Id
    @Column(
            name = "BOND_ID")

    @ManyToOne(
            fetch = FetchType.LAZY,
            optional = false,
            targetEntity = Deal.class)
    @JoinColumn(
            name = "DEAL_ID",
            nullable = false,
            updatable = false)
    protected Deal deal;

}

插入记录的简化代码

    long loanIdCounter = loanDao.getMaxId(), bondIdCounter = bondDao.getMaxId(); //Perform SELECT MAX(ID)

    Deal deal = null;

    List<Bond> bondList = new ArrayList<Bond>(COMMIT_BATCH_SIZE); //500 constant value
    List<Loan> loanList = new ArrayList<Loan>(COMMIT_BATCH_SIZE);

    for (String msg: inputStreamReader)
    {
        log.debug(msg.toString());

        if (this is a deal)
        {
            Deal deal = parseDeal(msg.getMessage());

            deal = dealManager.persist(holder.deal); //Called in a separate transaction using Spring annotation @Transaction(REQUIRES_NEW)

        }
        else if (this is a loan)
        {

            Loan loan = parseLoan(msg.getMessage());
            loan.setId(++loanIdCounter);
            loan.setDeal(deal);

            loanList.add(loan);

            if (loanList.size() == COMMIT_BATCH_SIZE)
            {
                loanManager.bulkInsert(loanList); //Perform a bulk insert in a single transaction, not annotated but handled manually this time
                loanList.clear();
            }
        }
        else if (this is a bond)
        {
            Bond bond = parseBond(msg.getMessage());
            bond.setId(++bondIdCounter);
            bond.setDeal(deal);

            bondList.add(bond);



            if (bondList.size() == COMMIT_BATCH_SIZE) //As above
            {
                bondManager.bulkInsert(bondList);
                bondList.clear();

            }
        }
    }

    if (!bondList.isEmpty())
        bondManager.bulkInsert(bondList);
    if (!loanList.isEmpty())
        loanManager.bulkInsert(loanList);
    //Flush remaining items, not important

实施bulkInsert

@Override
public void bulkInsert(Collection<Bond> bonds)
{
    // StatelessSession session = sessionFactory.openStatelessSession();
    Session session = sessionFactory.openSession();
    try
    {
        Transaction t = session.beginTransaction();
        try
        {
            for (Bond bond : bonds)
                // session.persist(bond);
                // session.insert(bond);
                session.save(bond);
        }
        catch (RuntimeException ex)
        {
            t.rollback();
        }
        finally
        {
            t.commit();
        }
    }
    finally
    {
        session.close();
    }

}

从评论中可以看到,我尝试了有状态/无状态的几种组合session。没有工作。

dataSourceComboPooledDataSource具有以下网址的

<b:property name="jdbcUrl" value="jdbc:mysql://server:3306/db?autoReconnect=true&amp;rewriteBatchedStatements=true" />

我的 SessionFactory

<b:bean id="sessionFactory" class="class.that.extends.org.springframework.orm.hibernate3.LocalSessionFactoryBean" lazy-init="false" depends-on="dataSource">
        <b:property name="dataSource" ref="phoenixDataSource" />
        <b:property name="hibernateProperties">
            <b:props>
                <b:prop key="hibernate.dialect">${hibernate.dialect}</b:prop> <!-- MySQL5InnoDb-->
                <b:prop key="hibernate.show_sql">${hibernate.showSQL}</b:prop>
                <b:prop key="hibernate.jdbc.batch_size">500</b:prop>
                <b:prop key="hibernate.jdbc.use_scrollable_resultset">false</b:prop>
                <b:prop key="hibernate.cache.use_second_level_cache">false</b:prop>
                <b:prop key="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</b:prop>
                <b:prop key="hibernate.cache.use_query_cache">false</b:prop>
                <b:prop key="hibernate.validator.apply_to_ddl">false</b:prop>
                <b:prop key="hibernate.validator.autoregister_listeners">false</b:prop>
                <b:prop key="hibernate.order_inserts">true</b:prop>
                <b:prop key="hibernate.order_updates">true</b:prop>
            </b:props>
        </b:property>
</b:bean>

即使我的项目范围的类extends LocalSessionFactoryBean,它 也不会 覆盖其方法(仅添加一些项目范围的方法)

几天以来我发疯了。我读了几篇文章,但没有一篇文章使我能够批量插入。我从使用Spring上下文(因此可以使用@Autowire我的类)进行的JUnit测试运行所有代码。我所有的尝试只会产生很多单独的INSERT陈述

我想念什么?


阅读 194

收藏
2020-05-17

共1个答案

一尘不染

您的查询可能正在被重写,但是通过查看Hibernate SQL日志您不会知道是否。Hibernate不会重写插入语句-
MySQL驱动程序会重写它们。换句话说,Hibernate将向驱动程序发送多个插入语句,然后驱动程序将重写它们。因此,Hibernate日志仅显示向主机发送的SQL
Hibernate,而不是驱动程序向数据库发送的SQL。

您可以通过在连接URL中启用MySQL的profileSQL参数来验证这一点:

<b:property name="jdbcUrl" value="jdbc:mysql://server:3306/db?autoReconnect=true&amp;rewriteBatchedStatements=true&amp;profileSQL=true" />

使用类似于您的示例,这就是我的输出:

insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
insert into Person (firstName, lastName, id) values (?, ?, ?)
Wed Feb 05 13:29:52 MST 2014 INFO: Profiler Event: [QUERY]  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) duration: 1 ms, connection-id: 81, statement-id: 33, resultset-id: 0, message: insert into Person (firstName, lastName, id) values ('person1', 'Name', 1),('person2', 'Name', 2),('person3', 'Name', 3),('person4', 'Name', 4),('person5', 'Name', 5),('person6', 'Name', 6),('person7', 'Name', 7),('person8', 'Name', 8),('person9', 'Name', 9),('person10', 'Name', 10)

Hibernate记录了前10行,尽管实际上不是发送到MySQL数据库。最后一行来自MySQL驱动程序,它清楚地显示了具有多个值的单个批处理插入,这实际上是发送到MySQL数据库的内容。

2020-05-17