translated

This commit is contained in:
geekpi 2023-03-03 08:46:28 +08:00
parent c5e09293a7
commit 83ceb98eda
2 changed files with 108 additions and 107 deletions

@@ -1,107 +0,0 @@
[#]: subject: "3 tips to manage large Postgres databases"
[#]: via: "https://opensource.com/article/23/2/manage-large-postgres-databases"
[#]: author: "Elizabeth Garrett Christensen https://opensource.com/users/elizabethchristensencrunchydatacom"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
3 tips to manage large Postgres databases
======
The relational database PostgreSQL (also known as Postgres) has grown increasingly popular, and enterprises and public sectors use it across the globe. With this widespread adoption, databases have become larger than ever. At Crunchy Data, we regularly work with databases north of 20TB, and our existing databases continue to grow. My colleague David Christensen and I have gathered some tips about managing a database with huge tables.
### Big tables
Production databases commonly consist of many tables with varying data, sizes, and schemas. It's common to end up with a single huge and unruly database table, far larger than any other table in your database. This table often stores activity logs or time-stamped events and is necessary for your application or users.
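As a rough way to find out which table has grown that large, the catalog query below lists the ten biggest ordinary tables by total size, including their indexes and TOAST data. It is only a sketch: the schema filter and the limit of 10 are assumptions to adjust for your own database.

```sql
-- List the largest ordinary tables in user schemas, biggest first.
SELECT n.nspname AS schema_name,
       c.relname AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'  -- ordinary tables only
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;
```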
Really large tables can cause challenges for many reasons, but a common one is locks. Regular maintenance on a table often requires locks, but locks on your large table can take down your application or cause a traffic jam and many headaches. I have a few tips for doing basic maintenance, like adding columns or indexes, while avoiding long-running locks.
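One way to keep a maintenance statement from causing that traffic jam is to cap how long it may wait for its lock, and to watch for blocked sessions while it runs. The sketch below assumes a 5-second limit, which is an arbitrary illustrative value, and uses the built-in `pg_blocking_pids()` function available in Postgres 9.6 and later.

```sql
-- In the maintenance session: give up after 5 seconds instead of
-- queueing behind other transactions (and making everyone queue behind us).
SET lock_timeout = '5s';

-- In a separate session: show which sessions are blocked, and by whom.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       left(query, 80) AS query_start
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```

If the maintenance statement times out, it fails without changing anything, so it can be retried at a quieter moment.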
**Adding indexes problem**: A plain `CREATE INDEX` blocks writes (inserts, updates, and deletes) to the table for the duration of the build, although reads can continue. If you have a massive table, this can take hours.
```
CREATE INDEX ON customers (last_name);
```
**Solution**: Use the **CREATE INDEX CONCURRENTLY** feature. This approach splits index creation into two parts: a brief lock to register the index, which starts tracking changes immediately while minimizing application blockage, followed by a full build of the index, after which queries can start using it. The trade-off is that the build takes longer than a plain `CREATE INDEX`, and the statement cannot run inside a transaction block.
```
CREATE INDEX CONCURRENTLY ON customers (last_name);
```
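If a concurrent build fails or is cancelled, Postgres leaves an invalid index behind, which still adds write overhead but is never used by queries. The following sketch shows one way to detect and rebuild it. The index name `customers_last_name_idx` is an assumption chosen for illustration; naming the index explicitly makes the cleanup step easier to write.

```sql
-- Find indexes on customers that were left invalid by a failed build.
SELECT c.relname AS index_name
FROM pg_index AS i
JOIN pg_class AS c ON c.oid = i.indexrelid
WHERE i.indrelid = 'customers'::regclass
  AND NOT i.indisvalid;

-- Drop the leftover and try again, still without blocking writes.
DROP INDEX CONCURRENTLY IF EXISTS customers_last_name_idx;
CREATE INDEX CONCURRENTLY customers_last_name_idx ON customers (last_name);
```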
### Adding columns
Adding a column is a common request during the life of a database, but with a huge table, it can be tricky, again, due to locking.
**Problem**: When you add a new column with a default that calls a volatile function (or, on Postgres versions before 11, any default at all), Postgres needs to rewrite the table while holding an exclusive lock. For big tables, this can take several hours.
**Solution**: Split the operation into multiple steps that together have the same effect as the single statement, while keeping control over when locks are taken.
Add the column:
```
ALTER TABLE all_my_exes ADD COLUMN location text;
```
Add the default:
```
ALTER TABLE all_my_exes ALTER COLUMN location
SET DEFAULT texas();
```
Use **UPDATE** to backfill the default into existing rows:
```
UPDATE all_my_exes SET location = DEFAULT;
```
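On a very large table, even the backfill can be spread over many short transactions so that no single statement holds row locks for long. The sketch below is one possible way to do that. It assumes the table has a primary key column named `id` and that `texas()` never returns NULL; neither is stated in the article, and the batch size of 10,000 is arbitrary.

```sql
-- Backfill one batch of rows that still lack a value.
-- Run repeatedly (for example from a script) until it reports UPDATE 0.
UPDATE all_my_exes
SET location = DEFAULT
WHERE id IN (
    SELECT id
    FROM all_my_exes
    WHERE location IS NULL   -- rows not yet backfilled
    ORDER BY id
    LIMIT 10000              -- assumed batch size
);
```

Committing between batches keeps each transaction short and gives autovacuum a chance to clean up dead rows as the backfill progresses.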
### Adding constraints
**Problem**: You want to add a check constraint for data validation. But if you use the straightforward approach, Postgres locks the table while it validates all of the existing rows. Also, if any row fails validation, the whole statement rolls back.
```
ALTER TABLE favorite_bands
ADD CONSTRAINT name_check
CHECK (name = 'Led Zeppelin');
```
**Solution**: Tell Postgres about the constraint but don't validate it yet. The first step takes only a short lock and ensures that all new or modified rows satisfy the constraint. A separate validation pass then confirms that all existing data passes the constraint.
Tell Postgres about the constraint without validating existing rows:
```
ALTER TABLE favorite_bands
ADD CONSTRAINT name_check
CHECK (name = 'Led Zeppelin') NOT VALID;
```
Then **VALIDATE** it after it's created:
```
ALTER TABLE favorite_bands VALIDATE CONSTRAINT name_check;
```
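To confirm which stage a constraint has reached, you can read its state from the system catalog. This is a small sketch using the `convalidated` column of `pg_constraint`, which is false after the `NOT VALID` step and true once `VALIDATE CONSTRAINT` has succeeded.

```sql
-- Show each constraint on favorite_bands and whether existing rows
-- have been validated against it.
SELECT conname      AS constraint_name,
       convalidated AS is_validated
FROM pg_constraint
WHERE conrelid = 'favorite_bands'::regclass;
```

The validation step takes a `SHARE UPDATE EXCLUSIVE` lock, so ordinary reads and writes on the table can continue while it scans the existing rows.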
### Hungry for more?
David Christensen and I will be in Pasadena, CA, at SCaLE's Postgres Days, March 9-10. Lots of great folks from the Postgres community will be there too. Join us!
--------------------------------------------------------------------------------
via: https://opensource.com/article/23/2/manage-large-postgres-databases
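Author: [Elizabeth Garrett Christensen][a]
Topic selection: [lkxed][b]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)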
[a]: https://opensource.com/users/elizabethchristensencrunchydatacom
[b]: https://github.com/lkxed/

@@ -0,0 +1,108 @@
[#]: subject: "3 tips to manage large Postgres databases"
[#]: via: "https://opensource.com/article/23/2/manage-large-postgres-databases"
[#]: author: "Elizabeth Garrett Christensen https://opensource.com/users/elizabethchristensencrunchydatacom"
[#]: collector: "lkxed"
[#]: translator: "geekpi"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
3 tips to manage large Postgres databases
======
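The relational database PostgreSQL (also known as Postgres) has become increasingly popular, and enterprises and public-sector organizations around the world use it. With this widespread adoption, databases have grown larger than ever. At Crunchy Data, we regularly work with databases over 20TB, and our existing databases keep growing. My colleague David Christensen and I have gathered some tips for managing a database with huge tables.
### Big tables
Production databases usually consist of many tables with varying data, sizes, and schemas. It is common to end up with a single huge, unruly table that is far larger than any other table in your database. This table often stores activity logs or time-stamped events and is necessary for your application or users.
Really large tables cause challenges for many reasons, but a common one is locks. Regular maintenance on a table often requires locks, and locks on a large table can take down your application or cause a traffic jam and many headaches. I have a few tips for doing basic maintenance, such as adding columns or indexes, while avoiding long-running locks.
**Adding indexes problem**: A plain `CREATE INDEX` blocks writes to the table for the duration of the build. If you have a massive table, this can take hours.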
```
CREATE INDEX ON customers (last_name)
```
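**Solution**: Use the **CREATE INDEX CONCURRENTLY** feature. This approach splits index creation into two parts: a brief lock to register the index, which starts tracking changes immediately while minimizing application blockage, followed by a full build of the index, after which queries can start using it.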
```
CREATE INDEX CONCURRENTLY ON customers (last_name)
```
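### Adding columns
Adding a column is a common request during the life of a database, but with a huge table it can be tricky, again because of locking.
**Problem**: When you add a new column with a default that calls a function, Postgres needs to rewrite the table. For big tables, this can take several hours.
**Solution**: Split the operation into multiple steps that together have the same effect as the single statement, while keeping control over when locks are taken.
Add the column: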
```
ALTER TABLE all_my_exes ADD COLUMN location text
```
Add the default:
```
ALTER TABLE all_my_exes ALTER COLUMN location
SET DEFAULT texas()
```
Use **UPDATE** to backfill the default into existing rows:
```
UPDATE all_my_exes SET location = DEFAULT
```
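### Adding constraints
**Problem**: You want to add a check constraint for data validation. But if you use the straightforward approach, Postgres locks the table while it validates all of the existing rows. Also, if any row fails validation, the whole statement rolls back.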
```
ALTER TABLE favorite_bands
ADD CONSTRAINT name_check
CHECK (name = 'Led Zeppelin')
```
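**Solution**: Tell Postgres about the constraint but don't validate it yet. The first step takes only a short lock and ensures that all new or modified rows satisfy the constraint. A separate validation pass then confirms that all existing data passes the constraint.
Tell Postgres about the constraint without validating existing rows: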
```
ALTER TABLE favorite_bands
ADD CONSTRAINT name_check
CHECK (name = 'Led Zeppelin') NOT VALID
```
Then **VALIDATE** it after it's created:
```
ALTER TABLE favorite_bands VALIDATE CONSTRAINT name_check
```
### Hungry for more?
David Christensen and I will be in Pasadena, CA, at SCaLE's Postgres Days, March 9-10. Lots of great folks from the Postgres community will be there too. Join us!
--------------------------------------------------------------------------------
via: https://opensource.com/article/23/2/manage-large-postgres-databases
Author: [Elizabeth Garrett Christensen][a]
Topic selection: [lkxed][b]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
[a]: https://opensource.com/users/elizabethchristensencrunchydatacom
[b]: https://github.com/lkxed/