논리학, 철학, 수학 등을 다루는 블로그입니다.
아래에서 저의 가장 최근 글들을 읽을 수 있습니다.
To read in English, toggle the button in the header.

A blog on logic, philosophy, mathematics, et cetra.
You can read my most recent posts below.

양자역학의 국소성에 대하여

23 Feb 2026

물리학

양자역학

벨 부등식은 흔히 “양자역학은 국소적이지 않다”는 결론으로 요약되곤 한다. 그러나 이 요약은 오해를 일으키는 표현인데, 국소성의 표준적인 의미로 따지면 양자역학은 완벽히 국소적이기 때문이다. 국소성의 표준적 의미란 다음과 같다.

정의. 어떤 물리 이론이 국소적_local일 조건은, 시공간의 두 사건_event에 대해 한 사건이 다른 사건에 빛보다 빠르게 영향을 주지 않는 것이다.

그리고 위의 정의를 적절히 해석했을 때 양자역학은 국소적임을 증명할 수 있다. 가령, 두 개의 얽힌 입자 A와 B로 이루어진 계의 파동함수 $\psi(a, b)$를 생각하자. 만약 우리가 입자 A에만 관심을 가진다면, 입자 A를 대상으로 한 관측의 결과가 어떻게 나올지는 다음의 밀도 행렬_{density matrix}로 완전히 표현할 수 있다.

\[\rho_{a'a} = \sum_{b} \psi^*(a, b)\psi(a', b)\]

예를 들어 관측량_observable $\mathbf{L}$이 A에 국소적으로 작용할 때, $\mathbf{L}$의 기댓값은 다음과 같이 구할 수 있음이 알려져 있다.

\[\langle \mathbf{L} \rangle = \operatorname{tr} \rho L\]

여기서 $\mathbf{L}$이 A에 국소적으로 작용한다는 표현의 의미는 다음과 같다.

\[L_{a'b', ab} = L_{a'a} \delta_{b'b}\]

따라서 입자 B에서 국소적으로 일어나는 시간 변화가 입자 A의 밀도 행렬을 변화시키지 않는다면, 다음의 의미에서 양자역학은 국소적이다.

임의의 두 입자에 대해, 한 입자의 국소적 시간 변화가 다른 입자의 국소적 상태에 빛보다 빠르게 영향을 주지 않는다.

이 사실을 증명해 보자. 입자 B에서 국소적 시간 변화가 진행된다고 하자. 가령 입자 B의 위치를 관측하거나, 운동량을 관측하거나, 입자 B를 다른 입자와 얽히게 만드는 등의 일이 일어난다. 무슨 일이 일어나든 간에 이 일은 $\psi(a, b)$를 변화시킬 것이다. 양자역학에서 계의 시간에 따른 변화는 유니터리 행렬로 표현된다. 즉, 어떤 유니터리 연산자 $\mathbf{U}$가 존재하여, 일정 시간이 지난 이후 계의 상태를 $\psi_f(a’, b’)$라고 하면

\[\psi_f(a', b') = \sum_{a, b} U_{a'b', ab} \psi(a, b)\]

이다. 그런데 시간 변화가 B에서만 국소적으로 진행된다면 $a \neq a’$일 때 $U_{a’b’, ab} = 0$이다. 따라서,

\[\psi_f(a, b') = \sum_{b} U_{b'b} \psi(a, b)\]

로 줄여 쓸 수 있다. 이로 인해 A의 밀도 행렬은 다음과 같이 갱신된다.

\[\rho'_{a'a} = \sum_{b} \psi_f^*(a, b)\psi(a', b)\]

두 식을 연립하면,

\[\begin{align} \rho'_{a'a} &= \sum_{b} \left( \left( \sum_{b'} U_{bb'} \psi(a, b') \right)^* \left( \sum_{b''} U_{bb''} \psi(a', b'') \right) \right) \\ &= \sum_{b, b', b''} U^*_{b'b} U_{bb''} \psi^*(a, b') \psi(a', b'') \end{align}\]

이다. 그런데 $\mathbf{U}$가 유니터리이므로 $\sum_b U^\ast_{b’b} U_{bb’’} = (U^\ast U)_{b’ b’’} = \delta_{b’ b’’}$이다. 따라서,

\[\rho'_{a'a} = \sum_{b'} \psi^*(a, b') \psi(a', b') = \rho_{a'a}\]

이다. 따라서 $\rho’ = \rho$이다. 즉, B에서 국소적으로 진행되는 유니터리한 시간 변화는 A의 밀도 행렬을 변화시키지 않는다. 실질적으로 이는 다음의 정리를 증명한 것이다.

통신 없음 정리_{No-communication theorem}. (얽힘을 고려하더라도) 빛보다 빠르게 정보를 전달하는 것은 불가능하다.

그렇다면 벨 부등식을 비롯한, 흔히 양자역학의 비국소성을 보여주는 것으로 제시되는 사례들은 어떻게 이해되어야 하는가? 가령, 코펜하겐 해석에 따르면 각운동량의 총합이 0인 두 전자를 앨리스와 밥이 나눠가진 뒤, 밥이 스핀 업의 전자를 관측한다면, 앨리스의 전자는 그 즉시 스핀 다운으로 붕괴된다. 그렇다면 앨리스의 전자의 밀도 행렬도 그에 맞추어 변해야 하는 것 아닌가?

이 질문에 대한 “실증주의적” 해답은 이렇다 (실증주의적이라는 표현은 이 해답에 대한 철학적 해석을 일단 차치하겠다는 것이다). 만약 두 개의 전자뿐 아니라 앨리스와 밥 또한 전체 계의 일부로 간주한다면, 밥의 관측이 있기 전과 후 계의 파동함수는 다음과 변화한다.

관측 전:

\[\begin{align} \psi &= \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-up electron}\\ &\qquad \land \,\,\, \text{Bob has not observed his spin-down electron}) \\ &+ \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-down electron} \\ &\qquad \land \,\,\, \text{Bob has not observed his spin-up electron}) \end{align}\]

관측 후:

\[\begin{align} \psi_f &= \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-up electron}\\ &\qquad \land \,\,\, \text{Bob has observed his spin-down electron}) \\ &+ \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-down electron} \\ &\qquad \land \,\,\, \text{Bob has observed his spin-up electron}) \end{align}\]

즉, 밥도 계의 일부이기 때문에 밥의 행위는 계를 불연속적으로 붕괴시키는 것이 아닌, 계의 유니터리한 시간 변화의 일부이다. 대신 앨리스-밥 계의 외부에서 제3자인 찰리가 밥을 관측한다면 그때는 $\psi_f$가 가령 다음과 같이 붕괴할 것이다.

\[\begin{align} \psi_f \longmapsto\; &(\text{Alice has not observed her spin-down electron} \\ &\land \; \text{Bob has observed his spin-up electron}) \end{align}\]

그러나 만약 찰리 또한 밥과 앨리스와 함께 계의 일부로 간주한다면, 전체 계의 파동함수는 다음과 같이 유니터리하게 시간 변화한다.

\[\begin{align} \psi_f &= \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-up electron}\\ &\qquad \land \,\,\, \text{Bob has observed his spin-down electron} \\ &\qquad \land \,\,\, \text{Charlie has observed Bob observing spin-down}) \\ &+ \frac{1}{\sqrt{2}}(\text{Alice has not observed her spin-down electron} \\ &\qquad \land \,\,\, \text{Bob has observed his spin-up electron} \\ &\qquad \land \,\,\, \text{Charlie has observed Bob observing spin-up}) \\ \end{align}\]

일반적으로, 관측자와 계를 분리하면 관측은 계의 파동함수를 붕괴시키지만, 관측자를 계에 포함시키면 관측 또한 계의 유니터리 시간 변화의 일부이다. 이 결론을 확장하면, 우주 전체의 파동함수는 항상 유니터리하게 시간 변화하며, 우주 전체의 파동함수만이 유일하게 물리적으로 유의미한 파동함수라는 주장을 얻는다. 이는 양자역학에 대한 다세계 해석으로 이어지게 된다.

On the Locality of Quantum Mechanics

23 Feb 2026

Physics

Quantum Mechanics

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

The Bell inequality is often summarised as concluding that “quantum mechanics is non-local.” However, this summary is misleading, as quantum mechanics is perfectly local when judged by the standard definition of locality. The standard definition of locality is as follows:

Definition. A physical theory is local if, for any two events in spacetime, one event does not influence the other faster than the speed of light.

When interpreted appropriately, quantum mechanics can be shown to be local under this definition. For instance, consider a system consisting of two entangled particles, A and B, with a wavefunction $\psi(a, b)$. If we are only interested in particle A, the outcomes of measurements on A can be fully described by the following density matrix:

\[\rho_{a'a} = \sum_{b} \psi^*(a, b)\psi(a', b)\]

For example, when an observable $\mathbf{L}$ acts locally on A, the expectation value of $\mathbf{L}$ is known to be given by:

\[\langle \mathbf{L} \rangle = \operatorname{tr} \rho L\]

Here, the expression “observable $\mathbf{L}$ acts locally on A” means the following:

\[L_{a'b', ab} = L_{a'a} \delta_{b'b}\]

Thus, if local changes in time at particle B do not affect the density matrix of particle A, quantum mechanics is local in the following sense:

For any two particles, local changes in time at one particle do not influence the local state of the other particle faster than the speed of light.

Let us prove this. Suppose a local change in time occurs at particle B, such as measuring its position, momentum, or entangling it with another particle. Whatever happens, this will alter $\psi(a, b)$. In quantum mechanics, the time evolution of a system is represented by a unitary matrix. That is, there exists a unitary operator $\mathbf{U}$ such that, after some time, the state of the system becomes $\psi_f(a’, b’)$:

\[\psi_f(a', b') = \sum_{a, b} U_{a'b', ab} \psi(a, b)\]

If the time evolution is local to B, then $U_{a’b’, ab} = 0$ when $a \neq a’$. Hence,

\[\psi_f(a, b') = \sum_{b} U_{b'b} \psi(a, b)\]

This leads to the following update of A’s density matrix:

\[\rho'_{a'a} = \sum_{b} \psi_f^*(a, b)\psi(a', b)\]

Combining the two equations, we have:

Since $\mathbf{U}$ is unitary, $\sum_b U^\ast_{b’b} U_{bb’’} = (U^\ast U)_{b’ b’’} = \delta_{b’ b’’}$. Therefore,

\[\rho'_{a'a} = \sum_{b'} \psi^*(a, b') \psi(a', b') = \rho_{a'a}\]

Thus, $\rho’ = \rho$. In other words, unitary time evolution local to B does not alter the density matrix of A. This effectively proves the following theorem:

No-communication theorem. (Even considering entanglement) It is impossible to transmit information faster than the speed of light.

How, then, should we understand cases like the Bell inequality, which are often presented as evidence of the non-locality of quantum mechanics? For instance, according to the Copenhagen interpretation, if Alice and Bob share two electrons with a total angular momentum of zero, and Bob observes a spin-up electron, Alice’s electron instantaneously collapses to spin-down. Does this not imply that Alice’s electron’s density matrix changes accordingly?

The “positivist” answer to this question is as follows (the term “positivist” is used here to set aside philosophical interpretations for now). If not only the two electrons but also Alice and Bob are considered part of the entire system, the wavefunction of the system evolves as follows before and after Bob’s observation:

Before observation:

After observation:

In other words, since Bob is part of the system, his actions are not a discontinuous collapse of the system but part of its unitary time evolution. Instead, if a third party, Charlie, observes Bob, then $\psi_f$ might collapse as follows:

\[\begin{align} \psi_f \longmapsto\; &(\text{Alice has not observed her spin-down electron} \\ &\land \; \text{Bob has observed his spin-up electron}) \end{align}\]

However, if Charlie is also considered part of the system along with Bob and Alice, the wavefunction of the entire system evolves unitarily as follows:

In general, if the observer and the system are separated, observation collapses the system’s wavefunction, but if the observer is included in the system, observation is part of the system’s unitary time evolution. Extending this conclusion, the wavefunction of the entire universe always evolves unitarily, and only the wavefunction of the entire universe is physically meaningful. This leads to the many-worlds interpretation of quantum mechanics.

강제법 개론

28 Jan 2026

수학

집합론

논리학

괴델은 1938년에 구성가능성 공리를 도입함으로써 연속체 가설의 부정이 ZFC에서 증명 불가능함을 보였다. 구성가능 전체 $L$이 연속체 가설이 성립하는 ZFC의 모델이기 때문이다. 이후 1963년에 코헨이 강제법을 개발하여, 연속체 가설이 성립하지 않는 ZFC의 모델을 발견했다. 이로써 연속체 가설은 ZFC와 독립적임이 보여졌다. 이 글에서는 코헨의 강제법과, 연속체 가설의 독립성 증명의 개요를 살펴본다. 미리 말해두자면 이 글의 증명들은 다소 엄밀하지 않으며 아이디어 위주로 정리했으니, 엄밀한 논증은 전공 서적(Kunen)을 통해 확인하길 바란다.

당연한 말이지만, 이 글에서는 ZFC가 무모순하다고 가정할 것이다.

1. 가산 추이적 모델

$V$가 누적 위계_{cumulative hierarchy}라고 하자. $(V, \in)$은 ZFC의 “모델”이다.

정리. 다음의 성질을 만족하는 $V$의 부분모델_submodel $(M, \in)$가 존재한다.

$M$은 ZFC의 모델이다.

$M$은 가산이다.

$M$은 추이적_transitive이다. 즉, 임의의 $x \in M$과 $y \in V$에 대해,

\[V \vDash y \in x \implies V \vDash y \in M\]

증명. 다음 정리를 사용한다.

모스토프스키 붕괴 정리_{Mostowski collapse lemma}. 모임_class $X$ 위의 이항 관계 $R$이 다음을 만족한다고 하자.

크기 제한_smallness: $R^{-1}(x) = \lbrace y \in X : yRx \rbrace $는 집합이다.

기초성_{well-foundedness}: 공집합이 아닌 $X$의 부분집합 $S$는 $R$-극소 원소를 가진다.

외연성_{extensionality}: $x = y$일 필요충분조건은 $R^{-1}(x) = R^{-1}(y)$이다.

이때, 어떤 $V$의 추이적 모델 $(M, \in)$이 유일하게 존재하여 $(M, \in)$과 $(X, R)$이 동형이다.

이 정리의 증명은 비교적 자명하나 쓸 게 많으므로 Hrbacek & Jech, 14장을 대신 참고로 달아놓는다.

하향 뢰벤하임-스콜렘 정리에 의해 $(V, \in)$의 가산 부분모델 $(N, \in)$이 존재한다. 크기 제한, 기초성, 외연성은 모두 1차 논리로 표현 가능한 성질이므로, $(V, \in)$이 모스토프스키 붕괴 정리의 조건들을 만족한다는 사실로부터 $(N, \in)$ 또한 만족한다는 사실이 따라 나온다. 따라서 $(N, \in)$과 동형인 추이적 모델 $(M, \in)$이 존재한다. $N, M$이 동형이므로 $M$은 가산이다. ■

따라서 $V$는 가산 추이적 부분모델 $M$을 가진다. $M$의 가산성과 추이성은 나중에 강제법에서 중요하게 쓰일 것이다.

2. 표기에 관해

앞서 $V$가 모델이라고 할 때 큰따옴표를 쳤는데, 그 이유는 엄밀히 말해 모델은 집합이어야 하기 때문이다. 즉, $V$는 모델이 되기에 너무 크다. 때문에 방금 증명에서 한 것처럼 $V$에 대해 뢰벤하임-스콜렘 정리를 마냥 사용할 수는 없다.

그러나 이 문제는 $V$를 $V_\kappa$ ($\kappa$는 도달 불가능 기수), 또는 (ZFC의 무모순성을 가정했을 때 괴델의 완전성 정리로부터 보장되는) ZFC의 다른 집합-모델로 바꿔치기 함으로써 해결할 수 있다. 따라서 이 글에서는 편의를 위해 $V$를 모델(집합)처럼 다룰 것이지만, 이것이 불편한 독자는 이 글의 모든 $V$를 $V_\kappa$로 바꿔치기 하면 된다.

추가적으로 다음 규약을 사용하겠다.

규약.

“집합론의 모델”을 “ZFC의 모델”과 동의어로 사용한다.

$V \vDash \phi$를 간단히 $\phi$라고 적는다. 즉, 모든 명제의 배경은 $V$이다.

3. 상대화

$M$이 집합론의 추이적 모델이라고 하자. ZFC는 단 하나의 이항 관계 $\in$를 가지는 언어의 이론이기 때문에 임의의 $x, y \in M$에 대해 다음은 보장이 되지만, (앞선 규약에 따라 우변의 $x \in y$는 $V \vDash x \in y$임을 유의하라)

\[M \vDash x \in y \iff x \in y\]

이외의 관계에서는 이것이 자동적으로 보장되지는 않는다. 예를 들어 $x, z \in M$에 대해,

\[M \vDash y = \mathcal{P}(x)\]

이지만

\[y \neq \mathcal{P}(x)\]

일 수 있다.¹ 따라서 ($V$에서의) 명제 $\phi$와 $M$ “내부”에서 본 명제 $\phi$를 구별할 필요가 있다. 이를 위해 다음을 정의한다.

정의. 명제 $\phi$에 대해, $\phi^M$을 $M$에 대해 상대화_{relativization}한 $\phi$라고 부르며, 다음과 같이 재귀적으로 정의한다.

$x, y \in M$에 대해 $(x \in y)^M$은 $x \in y$와 같다.

$x, y \in M$에 대해 $(x = y)^M$은 $x = y$와 같다.

$(\phi \land \psi)^M$은 $\phi^M \land \psi^M$과 같다.

$(\phi \lor \psi)^M$은 $\phi^M \lor \psi^M$과 같다.

$(\lnot \phi)^M$은 $\lnot \phi^M$과 같다.

$(\forall x:\phi)^M$은 $\forall x \in M: \phi^M$과 같다.

$(\exists x:\phi)^M$은 $\exists x \in M : \phi^M$과 같다.

즉, $M$에 대해 상대화한 명제는 기존의 명제에서 양화사의 범위를 모두 $M$으로 제한한 것이다.

Remark. 상대화의 정의로부터 $\phi^M \iff M \vDash \phi$이다. 때문에 이런 의문이 들 수 있다. 이미 $\vDash$가 있는데 왜 상대화를 정의하는 것일까? 그 이유는, $M \vDash \phi$는 (모델론의) 메타명제인 반면, $\phi^M$은 (집합론의) 명제이기 때문이다. 즉, 상대화를 사용하면 “$\phi$가 $M$에서 성립한다”는 사실을 $V$에서 표현할 수 있다. 이는 이후 강제 관계 $\Vdash$를 정의할 때 중요해질 것이다.

4. 절대성

정의. $M, N$이 집합론의 추이적 “모델”이고, $M$이 $N$의 부분”모델”이라고 하자. $\phi$가 $n$항 명제라고 하자. 임의의 $x_1, \dots, x_n \in M$에 대해,
\[\phi^M(x_1, \dots x_n) \leftrightarrow \phi^N(x_1, \dots, x_n)\]
가 성립한다면 $\phi$가 $M, N$ 사이에서 절대적_absolute이라고 한다. $N = V$인 경우, $V$에 대한 언급을 생략하여 $\phi$가 $M$ 위에서 절대적이라고 한다.

즉, $\phi$가 $M$ 위에서 절대적이라는 것은 $M$ 내부에서 본 $\phi$가, 전체에서 본 $\phi$와 일치한다는 것이다.

정의. $\exists x \in A, \forall x \in A$와 같이 범위가 특정 집합으로 한정되어 있는 양화사를 한정 양화사라고 한다. 모든 양화사가 한정 양화사인 명제를 $\Delta_0$ 명제라고 한다.

예를 들어 $y \subseteq x$의 정의는 다음과 같은데, $\forall z$가 $y$로 한정되어 있으므로 $\Delta_0$ 명제이다.

\[\forall z \in y (z \in x)\]

정리.

절대적 명제들의 합성은 절대적이다.

$\Delta_0$ 명제는 추이적 모델 사이에서 절대적이다.

증명. 명제에 대한 귀납법으로 증명한다. ■

따름정리. 다음 명제들은 추이적 모델 사이에서 절대적이다.

$y \subseteq x$

$f$는 함수이다, $f$는 전사이다, $f$는 단사이다

명제에 대한 절대성을 확장하여, 집합에 대한 절대성을 정의할 수 있다. 예를 들어 $\omega$는 가장 작은 귀납적 집합으로 정의된다. 구체적으로,

\[\mathrm{IsInductive}(x) : 0 \in x \land \forall y( y \in x \rightarrow y\cup \{y\} \in x)\]

로 정의하고,

\[\mathrm{IsOmega}(x) : \mathrm{IsInductive}(x) \land \forall z\Big(\mathrm{IsInductive}(z) \rightarrow z \subseteq x \Big)\]

로 정의하자. 이때 $\mathrm{IsOmega}$를 만족하는 집합이 유일하게 존재함을 보일 수 있으며, 그 집합을 $\omega$로 정의한다.

이와 같이 집합 $s$의 정의가 명제 $\phi$로 주어지는 경우², 만약 $\phi$가 $M, N$ 사이에서 절대적이라면, 집합 $s$가 $M, N$ 사이에서 절대적이라고 한다. 예를 들어 위의 따름정리로부터 $\mathrm{IsOmega}$가 추이적 모델 사이에서 절대적임을 알 수 있다. 따라서 $\omega$는 추이적 모델 사이에서 절대적이다.

따름정리. 다음 집합들은 추이적 모델 사이에서 절대적이다.

$\varnothing$

$x \times y, \lbrace x, y \rbrace , (x, y)$

$\bigcup x, \bigcap x$

$x$가 순서쌍들의 집합일 때, $\operatorname{dom} x$

$\omega\; (=\aleph_0)$

절대적이지 않은 명제/집합들의 예시를 보자.

정리. 다음 명제/집합들은 추이적 모델 사이에서 절대적이지 않다.

$\mathcal{P}(x)$

$x$는 가산이다.

$x$는 기수이다.

$\aleph_n \; (n > 0)$

$\mathcal{P}(x)$가 절대적이지 않은 이유를 훑어보자. $\mathcal{P}(x)$의 정의는 다음과 같다.

\[y = \mathcal{P}(x) : \underbrace{\forall z \subseteq x (z \in y)}_{(1)} \land \underbrace{\forall z \in y (z \subseteq x)}_{(2)}\]

(2)는 $\Delta_0$이기 때문에 괜찮지만, (1)이 문제다. (1)은 한정 양화사인 척을 하고 있지만, 사실 풀어쓰면 다음과 같다.

\[(1) : \forall z (z \subseteq x \rightarrow z \in y)\]

즉, (1)의 양화사는 비한정이다. 따라서 $\mathcal{P}(x)$는 $\Delta_0$가 아니며, 실제로도 절대적이지 않다. 2, 3, 4에 또한 $\Delta_0$로 표현되지 않음을 확인할 수 있다.

5. 강제 부분순서집합

드디어 강제법과 관련된 용어들이 등장하기 시작한다.

정의. $(\mathbb{P}, \leq, 1)$이 강제 부분순서집합_{forcing poset}이라는 것은 $(\mathbb{P}, \leq)$가 $1$을 극대 원소로 가지는 부분순서집합이라는 것이다.

강제법의 맥락에서 $p \leq q$는 보통 $p$가 $q$보다 “자세하다”, “엄격하다”, “까다롭다”, “일어날 확률이 작다” 등의 의미를 가진다. 또는, $p$는 $q$의 “확장”이다. 예를 들어,

$p:$ 주사위를 굴려서 나온 수가 4 이하의 짝수인 사건
$q:$ 주사위를 굴려서 나온 수가 짝수인 사건

라고 하자. $p$는 $q$보다 자세하고, 엄격하며, 까다롭고, 일어날 확률이 작은 사건이다. 또한 $p$는 조건이 추가된 $q$라는 점에서 $q$의 확장이기도 하다. 따라서 강제법의 맥락에서는 $p \leq q$로 순서를 주는 것이 자연스럽다.

조금 더 수학적인 예시로,

\[\mathrm{Fn}(\omega, \omega) = \{ f \; | \; f : \omega \to \omega \text{ is a partial function with finite domain}\}\]

에 대해 다음과 같이 순서를 주자.

\[f \leq g \iff f|_{\operatorname{dom}g} = g\]

예를 들어 $\lbrace (0, 1), (1, 2) \rbrace \leq \lbrace (1, 2) \rbrace $이다. 즉, $f \leq g$일 때 $f$는 $g$보다 “자세한” 함수이다. 다르게 말하면, $f$는 $g$의 확장이다. 또한 $\mathrm{Fn}(\omega, \omega)$는 $\varnothing$를 극대 원소로 가지므로, $\mathrm{Fn}(\omega, \omega)$는 강제 부분순서집합이다.

이후 특별한 언급이 없다면 $\mathbb{P}$는 강제 부분순서집합인 것으로 생각한다.

6. 필터

정의. $\mathbb{P}$의 부분집합 $\mathcal{F}$가 $\mathbb{P}$ 위의 필터_filter라는 것은 다음을 만족한다는 것이다.

$1 \in \mathcal{F}$

$x \in \mathcal{F}$이고 $x \leq y$라면, $y \in \mathcal{F}$이다.

$x, y \in \mathcal{F}$라면, 어떤 $z \in \mathcal{F}$가 존재하여 $z \leq x, z \leq y$이다.

필터에 대해서는 이미 이 블로그에서 여러 번 다루었는데, 필터의 핵심 성질은 3번이다. 즉, 필터의 임의의 두 원소는 공통 확장을 가진다. 이는 필터에 일종의 방향성을 준다. 위상수학에서는 이 방향성을 이용하여 필터를 점으로 수렴시키기도 하고, 모델론에서는 필터로부터 새로운 구조(초곱)를 구성하기도 한다. 이에 착안하여, 강제법은 필터로부터 새로운 집합론의 모델을 얻어낼 것이다.

예를 들어 앞선 $\mathbb{P} = \mathrm{Fn}(\omega, \omega)$의 예시로 돌아가자면, 전함수_{total function} $f: \omega \to \omega$에 대해 다음은 필터이다.

\[\mathcal{F}_f = \{ g \in \mathrm{Fn}(\omega, \omega) : f \leq g\}\]

특히, $\mathcal{F}_f$는 함수 $f$를 “표현”하는 필터라고 볼 수 있다. 구체적으로, 다음이 성립한다.

\[\bigcup \mathcal{F}_f = f\]

정리하자면, 부분함수_{partial function}들의 순서로부터 필터를 정의하여 전함수를 얻을 수 있다. 이는 다음의 흥미로운 가능성을 시사한다. 집합론의 모델 $M$에 대해, $\mathbb{P}$는 $M$의 원소이지만, $\mathbb{P}$의 필터 $\mathcal{F}$와 $\bigcup \mathcal{F}$는 $M$의 원소가 아닐 수 있다. 그렇다면 $M$에 $\mathcal{F}$를 “추가”하여, 새로운 집합론의 모델을 얻어봄직하다. 실제로 이것이 강제법의 아이디어이다.

조금 더 구체적으로 이후의 과정을 개괄해 보자. $M$이 가산 추이적인 $V$의 부분모델이라고 하자. 먼저 적절한 강제 부분순서집합 $\mathbb{P} \in M$과 필터 $\mathcal{F} \subseteq \mathbb{P}$를 정의할 것이다. 특히, $\mathcal{F}$와 $\mathbb{P}$가 특정 조건 — 구체적으로, $\mathcal{F}$는 제너릭이어야 하고 $\mathbb{P}$는 분리적이어야 한다 — 을 만족하게끔 정의하면 $\mathcal{F} \notin M$이 될 것이다. 이에 따라 $M$에 $\mathcal{F}$를 “추가”함으로써 새로운 집합론의 모델 $M[\mathcal{F}]$를 얻을 수 있다. 이는 체론에서 체 $F$에 원소 $a$를 추가하여 $F(a)$를 얻는 것과 개념적으로 비슷하다.

물론 $M[\mathcal{F}]$가 집합론의 모델이기 위해서는, $M$에 $\mathcal{F}$뿐 아니라, $\mathcal{F}$와 $M$의 원소들의 집합 연산으로부터 얻어질 수 있는 모든 집합들 또한 추가해야 한다. 이 또한 체론에서 $F$에 $a$를 추가하기 위해서는 $a^2, a + 1, 3a$ 등도 함께 추가해줘야 하는 것과 비슷하다. 이 과정을 일률적으로 달성하기 위해 $\mathbb{P}$-이름이라는 개념을 도입할 것이다.

7. 조밀성과 제너릭 필터

정의. $D \subseteq \mathbb{P}$가 다음을 만족할 때, $D$는 조밀_dense하다고 한다.
\[\forall x \in \mathbb{P} \; \exists y \in D : y \leq x\]

즉, $D$는 임의의 $x \in \mathbb{P}$보다 자세한 원소를 언제나 가지고 있다. 이는 마치 $A = \lbrace 0.9, 0.99, 0.999, …\rbrace $가 임의의 $x \in \mathbb{R}\setminus\lbrace 1\rbrace $보다 1에 가까운 원소를 언제나 가지고 있는 것과 비슷하다. 특히 $A$가 1 주변에서 조밀하다는 것을 생각해 보면, 위와 같은 이름이 붙은 이유를 이해할 수 있다. (여담: 스톤 이중성을 사용하여 이것보다 더 자연스러운 “조밀성”의 해석을 얻을 수 있다. 이에 관해서는 나중에 다뤄보겠다.)

다시 $\mathbb{P} = \mathrm{Fn}(\omega, \omega)$의 예시로 돌아가자면, 임의의 $n \in \omega$에 대해 다음의 $D_n$는 조밀하다.

\[D_n = \{ g \in \mathrm{Fn}(\omega, \omega) : n \in \operatorname{dom}g\}\]

왜냐하면 임의의 부분함수는 정의역이 $n$을 포함하도록 확장될 수 있기 때문이다.

한편 조밀한 $D$는 “흔한 성질”을 표현하는 것으로 이해될 수 있다. 예를 들어 앞서 정의한 $D_n$은, “정의역이 $n$을 포함함”이라는 성질을 표현하는데, 이는 “흔한” 성질이다. 왜냐하면 모든 부분함수는 정의역이 $n$을 포함하도록 확장될 수 있기 때문이다.

직관적으로 생각했을 때, $\mathbb{P}$ 위의 필터 $\mathcal{F}$가 $M$에 없는 새로운 원소이기를 바란다면 $\mathcal{F}$는 정의하기 어려워야 한다. 만약 $\mathcal{F}$가 $\phi$로 정의된다면, $\exists! x \lnot\phi(x)$는 ZFC의 정리일 것이고, 이에 따라 이미 $\mathcal{F} \in M$일 것이기 때문이다. 그리고 $\mathcal{F}$가 정의하기 어렵다는 것은 $\mathcal{F}$가 특출난 데가 없다는 의미이다. 즉, $\mathcal{F}$는 모든 흔한 특징을 가져야 한다.

이는 다음의 정의로 이어진다.

정의. $M$이 집합론의 추이적 모델이고, $\mathbb{P} \in M$이며, $G \subseteq \mathbb{P}$가 필터라고 하자 ($G \in M$일 필요는 없음). 임의의 $\mathbb{P}$의 조밀한 부분집합 $D \in M$에 대해 $D \cap \mathcal{F} \neq \varnothing$라면, $\mathcal{F}$가 $M$ 위에서 $\mathbb{P}$-제너릭_generic하다고 한다.

여기서 $D \cap \mathcal{F} \neq \varnothing$은 직관적으로 $\mathcal{F}$가 $D$라는 성질을 가진다는 의미이다. 제너릭의 정의는 배경이 되는 집합론의 모델 $M$에 상대적임에 유의하라.

구체적인 예시를 들어 보자. $\mathrm{id}:\omega \to \omega$가 항등함수라고 하자. 필터 $\mathcal{F}_{\mathrm{id}}$는 제너릭하지 않다. 왜냐하면

\[D = \{ f \in \mathrm{Fn}(\omega, \omega) \;|\; \exists n \in \operatorname{dom} f : f(n) \neq n \}\]

가 조밀함에도 불구하고 $D \cap \mathcal{F} = \varnothing$이기 때문이다. 이는 $\mathrm{id}$가 쉽게 정의 가능하며, $\mathrm{id} \in M$이라는 사실과 일맥상통한다.

그렇다면 역으로, 제너릭 필터는 정말 $M$의 원소가 아닐까? 이를 보장하기 위해서는 한 가지 조건이 더 필요하다.

정의. $x, y \in \mathbb{P}$에 대해 $z \leq x, z \leq y$를 만족하는 $z \in \mathbb{P}$가 없다면 $x$와 $y$가 양립 불가능_incompatible하다고 하며, $x \perp y$라고 적는다.

정의. $\mathbb{P}$가 분리적_separative이라는 것은, 임의의 $z \in \mathbb{P}$에 대해 어떤 $x \leq z, y \leq z$가 존재하여 $x \perp y$라는 것이다.

정리. $M$이 집합론의 추이적 모델이라고 하자. $\mathbb{P} \in M$이 분리적이고 $G \subseteq \mathbb{P}$가 $M$ 위의 $\mathbb{P}$-제너릭 필터라면 $G \notin M$이다.

증명. $G \in M$이라고 하자. 그러면 $D = \mathbb{P} \setminus G$는 $M$의 원소이다. $D$가 조밀함을 보이자. 만약 $D$가 조밀하지 않다면 어떤 $x \in \mathbb{P}$가 존재하여 $y \leq x$라면 $y \notin D$이다. 즉, $y \in G$이다. $\mathbb{P}$가 분리적이므로, $y_1 \perp y_2$인 $y_1, y_2 \leq x$가 존재한다. 따라서 $y_1, y_2 \in G$이다. 필터의 정의에 의해 어떤 $z \in \mathbb{P}$가 존재하여 $z \leq y_1, y_2$이다. 그런데 이는 $y_1 \perp y_2$에 모순된다.

따라서 $D$는 조밀하다. $G$는 제너릭하므로 $G \cap D \neq \varnothing$이어야 한다. 그런데 이는 $D$의 정의에 모순된다. 따라서 $G \notin M$이다. ■

제너릭 필터의 특징상, 제너릭 필터의 예시를 구체적으로 구성하기는 쉽지 않다. 대신 다음의 정리가 제너릭 필터의 존재성을 보장해 준다.

정의. $\mathcal{C}$가 $\mathbb{P}$에서 조밀한 집합들의 모임이라고 하자. 필터 $G \subseteq \mathbb{P}$가 $\mathcal{C}$-제너릭하다는 것은, 임의의 $D \in \mathcal{C}$에 대해 $D \cap G \neq \varnothing$이라는 것이다.

정리. $\mathcal{C}$가 $\mathbb{P}$에서 조밀한 집합들의 가산 모임이라면 $\mathcal{C}$-제너릭 필터가 존재한다.

증명. $\mathcal{C} = \lbrace D_n \rbrace _{n \in \omega}$라고 하자. 먼저 $x_0 \in D_0$를 선택한다. $x_n \in D_n$이 선택되었을 때, $x_{n+1} \in D_{n + 1}$을 $x_{n + 1} \leq x_n$을 만족하도록 선택한다 ($D_{n + 1}$이 조밀하므로 그러한 $x_{n+1}$은 언제나 존재한다). $G = \lbrace x_n \rbrace _{n \in \omega}$가 구하고자 하는 제너릭 필터이다. ■

따름정리. $M$이 가산 추이적 모델이라면 $M$ 위의 $\mathbb{P}$-제너릭 필터 $G$가 존재한다.

증명. $M$이 가산이므로 $\mathcal{C} = \lbrace D \in M : D \text{ is }\mathbb{P}\text{-dense}\rbrace $는 가산이다. ■

따라서 가산 추이적 모델에서는 언제나 $\mathbb{P}$-제너릭 필터를 상정할 수 있다. 이것이 도입부에서 가산성이 중요하다고 말한 이유이다.

(여담) 8. 마틴 공리

$\mathcal{C}$가 가산이 아닐 때에는 $\mathcal{C}$-제너릭 필터가 존재하지 않을 수 있다. 그러나 $\mathbb{P}$의 크기에 제한을 건다면, 가능한 조밀 집합들의 가능성이 한정되고, 이에 따라 비가산인 $\mathcal{C}$ 또한 제너릭 필터를 가지게 될 수 있다.

정의. $\mathbb{P}$의 부분집합 $A$가 반체인_antichain이라는 것은, 임의의 $x, y \in A$에 대해 $x \perp y$라는 것이다.

정의. $\mathbb{P}$가 가산 체인 성질_{countable chain condition}을 가진다는 것은 $\mathbb{P}$의 반체인이 기껏 가산이라는 것이다.

반체인과 가산 체인 성질에 대해서는 이 글에서 다룬 바 있는데, 그때 가산 체인 성질이 순서 집합(트리)의 “너비”를 제한한다고 설명했다.

정의. $\kappa$가 기수라고 하자. $\mathrm{MA}_\kappa$를 다음 진술로 정의한다: $\mathbb{P}$가 가산 체인 성질을 가지고, $\mathcal{C}$가 $\mathbb{P}$-조밀한 집합들의 모임이며, $|\mathcal{C}| \leq \kappa$라면, $\mathcal{C}$-제너릭 필터가 존재한다.

앞선 정리에 의해 $\mathrm{MA}_{\aleph_0}$는 자명하게 성립한다. 한편 $\mathrm{MA}_{\kappa}$는 $\kappa < 2^{\aleph_0}$를 시사하기 때문에 $\kappa \geq 2^{\aleph_0}$에 대해 $\mathrm{MA}_\kappa$는 성립하지 않는다.

마틴 공리_{Martin’s axiom}. 모든 $\kappa < 2^{\aleph_0}$에 대해 $\mathrm{MA}_{\kappa}$가 성립한다.

만약 연속체 가설이 참이라면 $\kappa < 2^{\aleph_0} \rightarrow \kappa = \aleph_0$이므로 마틴 공리가 성립한다. 그러나 마틴 공리와 연속체 가설의 부정이 무모순적임이 알려져 있다 (이 사실의 증명 또한 강제법을 사용한다. 개요: $2^{\aleph_0} = \aleph_2$이도록 강제한 다음, 추가로 $\mathrm{MA}_{\aleph_1}$을 강제한다.). 따라서 마틴 공리는 연속체 가설의 약화된 버전으로 생각될 수 있다.

9. $\mathbb{P}$-이름

$\mathbb{P}$가 분리적이고 $G$가 $\mathbb{P}$-제너릭할 때 $G \notin M$임을 보았다. 따라서 $M$에서는 $G$에 관해 진술할 수 있는 방법이 없는 것으로 보인다. 그러나 굉장히 기발한 트릭을 사용하면 $M$에서도 어느 정도 $G$에 관해 진술할 수 있다. 사실 이것이 강제법의 천재성이다.

아이디어는 다음과 같다. 이 글에서 필터는 “참”인 원소들의 모임으로 생각할 수 있다고 설명했다. 즉, $\mathbb{P}$를 사건들의 모임으로 본다면, $\mathcal{F}$는 실제로 일어난 사건들의 모임으로 생각할 수 있다.

이에 착안해, $\mathbb{P}$를 이용하여 원소를 확률적으로 가지는 집합이라는 개념을 떠올릴 수 있다. 예를 들어 $p \in \mathbb{P}$가 실제로 일어난다면 $a$를 원소로 가지는 집합 $X$는 다음과 같이 적는 것이다.

\[(a, p) \in X\]

필터 $G$는 실제로 일어난 사건들이 무엇인지를 명시한다. 만약 $p \in G$라면, $X$는 실제로 $a$를 원소로 가진다. 이를 다음과 같이 표현하자.

\[p \in G \iff a \in X_G\]

비유하자면, $X$는 양자 중첩 상태이고 $X_G$은 관측 후의 상태이다.

그런데 $X$와 같은 확률적 집합이 또다른 집합의 원소일 수도 있을 것이다. 거꾸로 말해, $(a, p) \in X$에서 $a$ 또한 확률적 집합일 수도 있다. 따라서 다음과 같이 적는 것이 정확하다.

\[p \in G \iff a_G \in X_G\]

이상의 논의를 엄밀하게 정리하면 $\mathbb{P}$-이름의 정의를 얻는다.

정의. 다음 꼴의 집합을 $\mathbb{P}$-이름_{$\mathbb{P}$-name}이라고 한다.
\[\sigma = \{ (\tau, p) : p \in \mathbb{P}, \tau \text{ is a }\mathbb{P}\text{-name}\}\]
$\mathbb{P}$-이름들 중 $M$에 속하는 이름들의 집합을 $M^\mathbb{P}$와 같이 적는다.

정의에 의해 $M^\mathbb{P} \subseteq M$이지만, $M^\mathbb{P} \in M$은 보장되지 않음에 유의하라.

정의. $G$가 $\mathbb{P}$ 위의 필터이고 $\sigma$가 $\mathbb{P}$-이름일 때, $G$에서 $\sigma$의 평가_valuation $\sigma_G$를 다음과 같이 정의한다.
\[\sigma_G = \{ \tau_G \;|\; \exists p \in G :(\tau, p) \in \sigma \}\]

위 정의는 순환 정의처럼 보이지만, 정확한 재귀 정의이다.[^3] 일례로 $\varnothing$이 $\mathbb{P}$-이름이며, 이에 따라 임의의 $p \in \mathbb{P}$에 대해 $\sigma = \lbrace (\varnothing, p) \rbrace $ 또한 $\mathbb{P}$-이름이다. 특히,

\[p \in G \iff \sigma_G = \{\varnothing \}\]

이다. 조금 더 복잡한 예시로, $p, q, r \in \mathbb{P}$일 때

\[\tau = \bigg\{ \Big(\big\{ ( \{ (\varnothing, r) \}, p), (\varnothing, q) \big\}, q\Big), (\varnothing, r) \bigg\}\]

또한 $\mathbb{P}$-이름이며, 특히 $q, r \in G, p \notin G$일 때 다음과 같음을 확인하라.

\[\tau_G = \{\{ \varnothing\}, \varnothing\}\]

이쯤에서 다음의 표기법을 도입하자.

\[\check{x} = \{(y, 1): y \in x\}\]

$1$은 $\mathbb{P}$의 극대 원소였음에 유의하라. 이로부터, 임의의 필터 $G$에 대해 $\check{x}_G = x$임을 알 수 있다. 즉, $\check{x}$는 $x$의 “이름표”이다.

앞서 $\mathbb{P}$-이름을 사용하면 $M \notin G$임에도 $M$ 내에서 $G$에 관해 어느 정도 진술할 수 있다고 했다. 이는 다음의 $\mathbb{P}$-이름이 $G$의 “이름표”이기 때문이다.

\[\Gamma = \{ (\check{p}, p) : p \in \mathbb{P} \}\]

$\Gamma$가 $G$의 이름표라는 말의 의미는, $\Gamma_G = G$가 성립한다는 것이다 (직접 확인해 보라). 그리고 이 점이 중요한데, $\mathbb{P} \in M$이므로 $\Gamma \in M$이다. 즉, $G$는 $M$의 원소가 아닐지언정 $G$의 이름표는 $M$의 원소인 것이다.

이제 다음과 같이 $M[G]$를 정의하자.

정의.
\[M[G] = \{\sigma_G : \sigma \in M^\mathbb{P} \}\]

$x \in M$일 때 $\check{x}_G = x$이므로 $M \subset M[G]$이며, $\Gamma_G = G$이므로 $G \in M[G]$이다.

유추했겠다시피, 바로 이 $M[G]$가 우리가 얻고자 했던 “새로운” 집합론의 모델이다. 즉, $M$이 집합론의 추이적 모델일 때 $M[G]$ 또한 집합론의 추이적 모델이다. 이 사실을 증명하기 위해, 우선 강제 관계를 정의하도록 하자.

10. 강제 관계

$G$가 어떤 원소를 가지느냐에 따라 $M[G]$의 생김새는 크게 달라진다. 다시 앞선 $\mathrm{Fn}(\omega, \omega)$의 예시로 돌아가자. 앞서 $f: \omega \to \omega$가 전함수일 때, 필터 $\mathcal{F}_f$가 존재하여 $\bigcup \mathcal{F}_f = f$라고 했다. 이의 역도 어느 정도 성립한다. 만약 $G$가 $\mathrm{Fn}(\omega, \omega)$의 제너릭 필터라면, $\bigcup G: \omega \to \omega$는 전함수이다. 추가로 $(1, 2) \in G$라면, $\bigcup G : 1 \mapsto 2$이다. 이를 다음과 같이 표현해 봄직하다.

\[(1, 2) \text{ forces } \Big(\bigcup G \Big)(1) = 2\]

그런데 우리는 위 진술이 $M$ 안에서도 표현 가능하기를 바라기 때문에, $\mathbb{P}$-이름을 대신 사용할 것이다. 즉,

\[(1, 2) \text{ forces } \Big(\bigcup \Gamma \Big)(\check{1}) = \check{2}\]

이를 엄밀히 정의하면 강제 관계를 얻는다.

정의. $\phi(x_1, \dots, x_n)$가 $n$개의 자유변수를 가지는 명제라고 하자. 임의의 $\tau_1, \dots, \tau_n \in M^\mathbb{P}$와 $p \in \mathbb{P}$에 대해,
\[p \Vdash \phi(\tau_1, \dots, \tau_n)\]
을 다음으로 정의한다: 임의의 $M$ 위의 $\mathbb{P}$-제너릭한 필터 $G$에 대해,
\[p \in G \rightarrow \phi^{M[G]}\big((\tau_1)_G, \dots, (\tau_n)_G\big)\]

(이쯤에서 명제의 성립은 $V$를 기준으로 한다는 사실을 리마인드하는 것이 좋겠다)

$\Vdash$는 forces로 읽는다.

예시로, 다음이 성립한다.

\[(1, 2) \Vdash \Big(\bigcup \Gamma \Big)(\check{1}) = \check{2}\] \[\varnothing \Vdash \bigcup \Gamma \text{ is a function}\]

11. 강제법의 기본 정리

이제 우리는 강제법의 핵심이라고 부를 수 있는 정리에 다다랐다. 안타깝게도 이 정리의 증명은 필자도 아직 공부 중이기에 생략하지만, 관심 있는 독자는 Kunen (1980)을 참고하기를 바란다 (제1정리와 제2정리는 7장, Theorem 3.6에서 등장한다).

강제법 제1정리. $\mathbb{P}$가 주어졌을 때, $M$ 내에서 정의 가능한 관계 $\Vdash^\ast$가 존재하여, 다음이 성립한다: 임의의 명제 $\phi$와 $\mathbb{P}$-이름 $\tau_1, \dots, \tau_n$이 대해,
\[p \Vdash \phi(\tau_1, \dots, \tau_n) \longleftrightarrow \Big( p \Vdash^* \phi(\tau_1, \dots, \tau_n) \Big)^M\]

즉, $M$ 내부에서도 강제 관계 $\Vdash$를 정확하게 흉내낼 수 있다. 또다른 중요한 정리는 다음과 같다.

강제법 제2정리. 임의의 명제 $\phi$와 $\mathbb{P}$-이름 $\tau_1, \dots, \tau_n$이 대해,
\[\bigg( \phi\Big((\tau_1)_G, \dots, (\tau_n)_G \Big) \bigg)^{M[G]} \longleftrightarrow \;\;\exists p \in G : p \Vdash \phi(\tau_1, \dots, \tau_n)\]

즉, 만약 $\phi(\vec{\tau}_G)$가 $M[G]$에서 참이라면, 이는 $\phi(\vec{\tau})$를 강제하는 $p$가 $G$에 있기 때문이다. 직관적으로 이는 다음과 같이 이해할 수 있다. $G$가 제너릭하다는 것은 $G$가 어떠한 우연성이나 특이점 없이 극히 평범하다는 의미이므로, 만약 $M[G]$가 $M$에서는 만족되지 않는 명제 $\phi$를 새롭게 만족하게 되었다면, 이는 외부의 우연에 의해서가 아닌 $G$ 내부 조건에 의한 것이어야 한다.

제1정리와 제2정리로부터, 다음이 얻어진다.

정리. $M$이 집합론의 추이적 모델이라면, $M[G]$ 또한 집합론의 추이적 모델이다.

(이 증명은 필자가 스스로 끄적여 본 것이기 때문에 틀릴 수도 있다.)

증명. 먼저 추이성을 증명하자. $x \in M[G]$이고, $y \in x$라고 하자. $y \in M[G]$를 보여야 한다.

$M[G]$의 정의에 의해 어떤 $\sigma \in M^{\mathbb{P}}$가 있어 $x = \sigma_G$이다. 따라서 $y \in x$로부터 다음을 알 수 있다: 어떤 $\mathbb{P}$-이름 $\tau$와 $p \in G$가 존재하여 $(\tau, p) \in \sigma$이고 $y = \tau_G$이다. $M$이 추이적이므로,

\[(\tau, p) \in \sigma \in M\]

으로부터 $(\tau, p) \in M$임을, 즉 $\tau \in M$임을 알 수 있다. 따라서 $\tau_G = y \in M[G]$이다.

이제 $M[G]$가 ZFC의 모델임을 증명하자. $\theta$가 ZFC의 공리일 때, $\theta^{M[G]}$가 성립함을 보이면 된다. 예를 들어 $\theta$가 $\phi(x, \vec{\tau}_G)$에 대한 분리 공리라고 하자 ($\vec{\tau}_G$는 $M[G]$의 원소들). 즉,

\[\theta \Leftrightarrow \forall x \;\exists y : y = \{z \in x : \phi(z, \vec{\tau}_G)\}\]

이며,

\[\theta \Leftrightarrow \forall x \in M[G] \;\exists y\in M[G] : y = \{z \in x : \phi^{M[G]}(z, \vec{\tau}_G)\}\]

이다.

$\forall$의 정의에 의해 어떤 $x = \sigma_G$에 대해 $\theta^{M[G]}$가 성립함을 보이면 충분하다.

\[y = \{ z \in \sigma_G : \phi^{M[G]}(z, \vec{\tau}_G) \} \in M[G]\]

$y$를 다음과 같이 정리할 수 있다.

\[y = \Big\{ \pi_G : \big( \exists p\in G\; (\pi, p) \in \sigma \big)\land\phi^{M[G]}(\pi_G, \vec{\tau}_G) \Big\}\]

제2정리에 의해,

\[\phi^{M[G]}(\pi_G, \vec{\tau}_G) \longleftrightarrow \exists q \in G: q \Vdash \phi(\pi, \vec{\tau})\]

제1정리에 의해,

\[q\Vdash \phi(\pi, \vec{\tau}) \longleftrightarrow \Big( q \Vdash^* \phi(\pi, \vec{\tau}) \Big)^M\]

따라서,

\[y = \Big\{ \pi_G : \big( \exists p\in G\; (\pi, p) \in \sigma \big)\land \big(\exists q \in G\; (q \Vdash^* \phi(\pi, \vec{\tau}))^M \big) \Big\}\]

여기서 다음의 보조정리를 사용하자.

보조정리. $\sigma \in M^{\mathbb{P}}$일 때, 다음과 같이 정의한다.
\[\overline{\sigma} = \Big\{ (\tau, q) : \exists p \in \mathbb{P} \; \Big( (\tau, p) \in \sigma \land q \leq p \Big) \Big\}\]
이때, $\sigma_G = \overline{\sigma}_G$이며, $\overline{\sigma} \in M^{\mathbb{P}}$이다.

증명은 필터의 정의로부터 거의 자명하다. 보조정리와, 필터의 방향성 및 $\Vdash^\ast$의 방향성으로부터 다음을 얻는다 $(r \leq p, q)$.

\[y = \Big\{ \pi_G : \exists r\in G\; \big( (\pi, r) \in \overline{\sigma} \land (r \Vdash^* \phi(\pi, \vec{\tau}))^M \big) \Big\}\]

따라서,

\[\check{y} = \Big\{ (\pi, r) \in \overline{\sigma} : \big(r \Vdash^* \phi(\pi, \vec{\tau})\big)^M \ \Big\}\]

일 때 $\check{y}_G = y$이다. 그런데 $\check{y}$는 $M$에서 $\sigma$를 $p \Vdash^\ast \phi(\pi, \vec{\tau})$에 대해 분리한 집합이고, $M$은 분리 공리를 만족하므로, $\check{y} \in M$이며, 이에 따라 $y \in M[G]$이다.

나머지 공리에 대해서도 비슷하게 증명할 수 있다. ■

이로써 강제 관계를 정의하고, 그 성질을 살펴본 뒤, 제너릭 필터로부터 주어지는 $M[G]$가 집합론의 모델임을 간단히 증명해 보았다. 다음 글에서는 이 글에서 얻은 결과들을 이용하여 연속체 가설의 무모순성을 증명할 것이다.

Reference

전한울, What is forcing? (2015)
K. Kunen, Set theory: an introduction to independence proofs (1980)
K. Hrbacek & T. Jech, Introduction to Set Theory, 3rd edition (1999)

$y = \mathcal{P}(x)$의 정의는 $\forall z \subseteq x (z \in y) \land \forall z \in y (z \subseteq x)$이고, $y \subseteq x$의 정의는 $\forall z \in y (z \in x)$이다. ↩
프레게-러셀 식의 기술주의가 떠오르는 대목이다. ↩

Introduction to Forcing

28 Jan 2026

Mathematics

Set Theory

Logic

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

Gödel, in 1938, introduced the axiom of constructibility and demonstrated that the negation of the continuum hypothesis is unprovable in ZFC. This is because the constructible universe $L$ is a model of ZFC in which the continuum hypothesis holds. Later, in 1963, Cohen developed forcing and discovered a model of ZFC in which the continuum hypothesis does not hold. This established the independence of the continuum hypothesis from ZFC. This post outlines Cohen’s forcing and the proof of the independence of the continuum hypothesis. The proofs are written somewhat loosely, with an emphasis on the basic ideas, so readers should refer to textbooks for more rigorous approach.

It goes without saying that this post assumes the consistency of ZFC.

1. Countable Transitive Models

Let $V$ be the cumulative hierarchy. Then $(V, \in)$ is a “model” of ZFC.

Theorem. There exists a submodel $(M, \in)$ of $V$ satisfying the following properties:

$M$ is a model of ZFC.

$M$ is countable.

$M$ is transitive. That is, for any $x \in M$ and $y \in V$,

\[V \vDash y \in x \implies V \vDash y \in M\]

Proof. The following theorem is used:

Mostowski Collapse Lemma. Let $X$ be a class with a binary relation $R$ satisfying the following:

Smallness: $R^{-1}(x) = \lbrace y \in X : yRx \rbrace$ is a set.

Well-foundedness: Any non-empty subset $S$ of $X$ has an $R$-minimal element.

Extensionality: $x = y$ if and only if $R^{-1}(x) = R^{-1}(y)$.

Then there exists a unique transitive model $(M, \in)$ of $V$ such that $(M, \in)$ is isomorphic to $(X, R)$.

The proof of this lemma is relatively straightforward but lengthy, so refer to Chapter 14 of Hrbacek & Jech instead.

By the downward Löwenheim-Skolem theorem, there exists a countable submodel $(N, \in)$ of $(V, \in)$. Since smallness, well-foundedness, and extensionality are all expressible in first-order logic, it follows that $(N, \in)$ satisfies the conditions of the Mostowski Collapse Lemma if $(V, \in)$ does. Hence, there exists a transitive model $(M, \in)$ isomorphic to $(N, \in)$. Since $N$ and $M$ are isomorphic, $M$ is countable. ■

Thus, $V$ has a countable transitive submodel $M$. The countability and transitivity of $M$ will play an important role in forcing.

2. On Notation

Earlier, we referred to $V$ as a “model” with quotation marks because, strictly speaking, a model must be a set. That is, $V$ is too large to be a model. Therefore, the Löwenheim-Skolem theorem cannot be applied to $V$ directly.

However, this issue can be resolved by replacing $V$ with $V_\kappa$ (where $\kappa$ is an inaccessible cardinal) or with another set-model of ZFC (guaranteed by Gödel’s completeness theorem, assuming the consistency of ZFC). For convenience, this post will treat $V$ as if it were a model (set), but readers uncomfortable with this can replace every occurrence of $V$ in this post with $V_\kappa$.

Additionally, the following conventions will be used:

Conventions.

“Model of set theory” will be used synonymously with “model of ZFC.”

$V \vDash \phi$ will be abbreviated as $\phi$. That is, the background of all statements is $V$.

3. Relativisation

Let $M$ be a transitive model of set theory. Since ZFC is a theory in the language of a single binary relation $\in$, the following is guaranteed for any $x, y \in M$ (note that, by the earlier convention, $x \in y$ on the right-hand side means $V \vDash x \in y$):

\[M \vDash x \in y \iff x \in y\]

However, this does not automatically hold for other relations. For example, for $x, z \in M$,

\[M \vDash y = \mathcal{P}(x)\]

but

\[y \neq \mathcal{P}(x)\]

may hold.¹ Therefore, it is necessary to distinguish between a statement $\phi$ in $V$ and the statement $\phi$ as viewed “internally” in $M$. To this end, we define the following:

Definition. For a statement $\phi$, the relativisation of $\phi$ to $M$, denoted $\phi^M$, is defined recursively as follows:

For $x, y \in M$, $(x \in y)^M$ is $x \in y$.

For $x, y \in M$, $(x = y)^M$ is $x = y$.

$(\phi \land \psi)^M$ is $\phi^M \land \psi^M$.

$(\phi \lor \psi)^M$ is $\phi^M \lor \psi^M$.

$(\lnot \phi)^M$ is $\lnot \phi^M$.

$(\forall x:\phi)^M$ is $\forall x \in M: \phi^M$.

$(\exists x:\phi)^M$ is $\exists x \in M : \phi^M$.

In other words, a statement relativised to $M$ is obtained by restricting the quantifiers in the original statement to $M$.

Remark. By definition, $\phi^M \iff M \vDash \phi$. This may raise the question: why define relativisation when $\vDash$ already exists? The reason is that $M \vDash \phi$ is a meta-statement in model theory, whereas $\phi^M$ is a statement in set theory. That is, relativisation allows the fact that “$\phi$ holds in $M$” to be expressed within $V$. This will become important when defining the forcing relation $\Vdash$.

4. Absoluteness

Definition. Let $M, N$ be transitive “models” of set theory, and suppose $M$ is a sub”model” of $N$. Let $\phi$ be an $n$-ary statement. If, for any $x_1, \dots, x_n \in M$,
\[\phi^M(x_1, \dots x_n) \leftrightarrow \phi^N(x_1, \dots, x_n),\]
then $\phi$ is said to be absolute between $M$ and $N$. If $N = V$, the reference to $V$ is omitted, and $\phi$ is said to be absolute over $M$.

That is, a statement $\phi$ is absolute over $M$ if $\phi$ as viewed internally in $M$ coincides with $\phi$ as viewed externally in $V$.

Definition. A statement is $\Delta_0$ if all its quantifiers are bounded, such as $\exists x \in A$ and $\forall x \in A$.

For example, the definition of $y \subseteq x$ is

\[\forall z \in y (z \in x),\]

which is a $\Delta_0$ statement because $\forall z$ is bounded by $y$.

Theorem.

The composition of absolute statements is absolute.

$\Delta_0$ statements are absolute between transitive models.

Proof. By induction on statements. ■

Corollary. The following statements are absolute between transitive models:

$y \subseteq x$

$f$ is a function, $f$ is surjective, $f$ is injective

The notion of absoluteness can be extended from statements to sets. For example, $\omega$ is defined as the smallest inductive set. Specifically,

\[\mathrm{IsInductive}(x) : 0 \in x \land \forall y( y \in x \rightarrow y\cup \{y\} \in x)\]

and

\[\mathrm{IsOmega}(x) : \mathrm{IsInductive}(x) \land \forall z\Big(\mathrm{IsInductive}(z) \rightarrow z \subseteq x \Big).\]

It can be shown that there exists a unique set satisfying $\mathrm{IsOmega}$, which is defined as $\omega$.

If the definition of a set $s$ is given by a statement $\phi$,² then $s$ is said to be absolute between $M$ and $N$ if $\phi$ is absolute between $M$ and $N$. For example, by the corollary above, $\mathrm{IsOmega}$ is absolute between transitive models. Hence, $\omega$ is absolute between transitive models.

Corollary. The following sets are absolute between transitive models:

$\varnothing$

$x \times y, \lbrace x, y \rbrace, (x, y)$

$\bigcup x, \bigcap x$

$\operatorname{dom} x$ when $x$ is a set of ordered pairs

$\omega\; (=\aleph_0)$

Here are some examples nof non-absoluteness.

Theorem. The following propositions/sets are not absolute between transitive models:

$\mathcal{P}(x)$

$x$ is countable.

$x$ is a cardinal.

$\aleph_n \; (n > 0)$

Let us briefly examine why $\mathcal{P}(x)$ is not absolute. The definition of $\mathcal{P}(x)$ is as follows:

\[y = \mathcal{P}(x) : \underbrace{\forall z \subseteq x (z \in y)}_{(1)} \land \underbrace{\forall z \in y (z \subseteq x)}_{(2)}\]

While (2) is $\Delta_0$ and thus unproblematic, (1) is the issue. Although (1) appears to be a bounded quantifier, it can be expanded as follows:

\[(1) : \forall z (z \subseteq x \rightarrow z \in y)\]

This shows that the quantifier in (1) is unbounded. Therefore, $\mathcal{P}(x)$ is not $\Delta_0$ and, indeed, is not absolute. Similarly, it can be verified that 2, 3, and 4 are also not expressible as $\Delta_0$ statements.

5. Forcing Posets

We now introduce terminology directly related to forcing.

Definition. $(\mathbb{P}, \leq, 1)$ is a forcing poset if $(\mathbb{P}, \leq)$ is a partially ordered set with $1$ as its greatest element.

In the context of forcing, $p \leq q$ is typically interpreted as “$p$ is more detailed than $q$,” “$p$ is stricter,” “$p$ is more specific,” or “$p$ is less likely to occur.” Alternatively, $p$ is an “extension” of $q$. For example:

$p$: The event that the result of rolling a die is an even number less than or equal to 4.
$q$: The event that the result of rolling a die is an even number.

Here, $p$ is more detailed, stricter, and less likely to occur than $q$. Moreover, $p$ is an extension of $q$ since $p$ imposes additional conditions on $q$. Thus, it is natural to define the order $p \leq q$ in the context of forcing.

A more mathematical example is given by:

\[\mathrm{Fn}(\omega, \omega) = \{ f \; | \; f : \omega \to \omega \text{ is a partial function with finite domain}\}\]

with the following order:

\[f \leq g \iff f|_{\operatorname{dom}g} = g\]

For instance, $\lbrace (0, 1), (1, 2) \rbrace \leq \lbrace (1, 2) \rbrace $. That is, $f \leq g$ means that $f$ is a more detailed function than $g$. Alternatively, $f$ is an extension of $g$. Since $\mathrm{Fn}(\omega, \omega)$ has $\varnothing$ as its greatest element, $\mathrm{Fn}(\omega, \omega)$ is a forcing poset.

Unless otherwise stated, $\mathbb{P}$ will be assumed to be a forcing poset.

6. Filters

Definition. A subset $\mathcal{F} \subseteq \mathbb{P}$ is a filter on $\mathbb{P}$ if it satisfies the following:

$1 \in \mathcal{F}$.

If $x \in \mathcal{F}$ and $x \leq y$, then $y \in \mathcal{F}$.

If $x, y \in \mathcal{F}$, then there exists $z \in \mathcal{F}$ such that $z \leq x$ and $z \leq y$.

Filters have been discussed multiple times on this blog, and their key property is condition 3. That is, any two elements of a filter have a common extension. This property gives filters a certain “directedness.” In topology, this directedness is used to make filters converge to a point, and in model theory, filters are used to construct new structures (ultraproducts). Inspired by this, forcing uses filters to construct new models of set theory.

For example, consider the earlier example $\mathbb{P} = \mathrm{Fn}(\omega, \omega)$. For a total function $f: \omega \to \omega$, the following is a filter:

\[\mathcal{F}_f = \{ g \in \mathrm{Fn}(\omega, \omega) : f \leq g\}\]

In particular, $\mathcal{F}_f$ can be thought of as the filter that “represents” the function $f$. Specifically, the following holds:

\[\bigcup \mathcal{F}_f = f\]

In summary, a total function can be obtained from the order of partial functions by defining a filter. This suggests an intriguing possibility: for a model of set theory $M$, $\mathbb{P}$ may be an element of $M$, but the filter $\mathcal{F}$ and $\bigcup \mathcal{F}$ may not be elements of $M$. In that case, one might consider “adding” $\mathcal{F}$ to $M$ to obtain a new model of set theory. This is, in essence, the idea behind forcing.

To elaborate, suppose $M$ is a countable transitive submodel of $V$. We will define a suitable forcing poset $\mathbb{P} \in M$ and a filter $\mathcal{F} \subseteq \mathbb{P}$. In particular, $\mathcal{F}$ will be defined to satisfy certain conditions — specifically, $\mathcal{F}$ must be generic, and $\mathbb{P}$ must be separative — ensuring that $\mathcal{F} \notin M$. By “adding” $\mathcal{F}$ to $M$, we obtain a new model of set theory $M[\mathcal{F}]$. This is conceptually similar to adjoining an element $a$ to a field $F$ to obtain $F(a)$.

Of course, for $M[\mathcal{F}]$ to be a model of set theory, we must add not only $\mathcal{F}$ to $M$ but also all sets that can be obtained from $\mathcal{F}$ and the elements of $M$ through set operations. This is analogous to adjoining $a^2, a + 1, 3a$, etc., when adjoining $a$ to a field $F$. To achieve this systematically, we will introduce the concept of $\mathbb{P}$-names.

7. Denseness and Generic Filters

Definition. A subset $D \subseteq \mathbb{P}$ is dense if the following holds:
\[\forall x \in \mathbb{P} \; \exists y \in D : y \leq x\]

That is, $D$ always contains an element more detailed than any given $x \in \mathbb{P}$. This is analogous to the set $A = \lbrace 0.9, 0.99, 0.999, …\rbrace $ being “dense” around 1, as $A$ always contains elements closer to 1 than any given $x \in \mathbb{R}\setminus\lbrace 1\rbrace $.

Returning to the example $\mathbb{P} = \mathrm{Fn}(\omega, \omega)$, for any $n \in \omega$, the following $D_n$ is dense:

\[D_n = \{ g \in \mathrm{Fn}(\omega, \omega) : n \in \operatorname{dom}g\}\]

This is because any partial function can be extended to include $n$ in its domain.

A dense set $D$ can be understood as representing a “common property.” For instance, the earlier $D_n$ represents the property “having $n$ in the domain,” which is a “common” property since any partial function can be extended to have $n$ in its domain.

Intuitively, if we want a filter $\mathcal{F}$ on $\mathbb{P}$ to be a new element not in $M$, $\mathcal{F}$ must be hard to define. If $\mathcal{F}$ were definable by some $\phi$, then $\exists! x \lnot\phi(x)$ would be a theorem of ZFC, implying $\mathcal{F} \in M$. The difficulty in defining $\mathcal{F}$ means that $\mathcal{F}$ must lack any special characteristics. In other words, $\mathcal{F}$ must satisfy all common properties.

This leads to the following definition:

Definition. Let $M$ be a transitive model of set theory, $\mathbb{P} \in M$, and $G \subseteq \mathbb{P}$ a filter (not necessarily in $M$). If $G \cap D \neq \varnothing$ for every dense subset $D \in M$ of $\mathbb{P}$, then $G$ is said to be $\mathbb{P}$-generic over $M$.

Here, $G \cap D \neq \varnothing$ intuitively means that $G$ satisfies the property represented by $D$. Note that the definition of genericity is relative to the background model $M$.

For a concrete example, let $\mathrm{id}:\omega \to \omega$ be the identity function. The filter $\mathcal{F}_{\mathrm{id}}$ is not generic. This is because:

\[D = \{ f \in \mathrm{Fn}(\omega, \omega) \;|\; \exists n \in \operatorname{dom} f : f(n) \neq n \}\]

is dense, yet $D \cap \mathcal{F} = \varnothing$. This aligns with the fact that $\mathrm{id}$ is easily definable and $\mathrm{id} \in M$.

Conversely, is a generic filter truly not an element of $M$? To ensure this, one additional condition is required.

Definition. For $x, y \in \mathbb{P}$, if there is no $z \in \mathbb{P}$ such that $z \leq x$ and $z \leq y$, then $x$ and $y$ are said to be incompatible, denoted $x \perp y$.

Definition. $\mathbb{P}$ is separative if, for any $z \in \mathbb{P}$, there exist $x, y \leq z$ such that $x \perp y$.

Theorem. Let $M$ be a transitive model of set theory. If $\mathbb{P} \in M$ is separative and $G \subseteq \mathbb{P}$ is a $\mathbb{P}$-generic filter over $M$, then $G \notin M$.

Proof. Suppose $G \in M$. Then $D = \mathbb{P} \setminus G$ is an element of $M$. We show that $D$ is dense. If $D$ is not dense, there exists $x \in \mathbb{P}$ such that $y \leq x$ implies $y \notin D$. That is, $y \in G$. Since $\mathbb{P}$ is separative, there exist $y_1, y_2 \leq x$ such that $y_1 \perp y_2$. Thus, $y_1, y_2 \in G$. By the definition of a filter, there exists $z \in \mathbb{P}$ such that $z \leq y_1, y_2$. However, this contradicts $y_1 \perp y_2$.

Hence, $D$ is dense. Since $G$ is generic, $G \cap D \neq \varnothing$. However, this contradicts the definition of $D$. Therefore, $G \notin M$. ■

Due to the nature of generic filters, constructing explicit examples of generic filters is challenging. Instead, the following theorem guarantees their existence.

Definition. Let $\mathcal{C}$ be a collection of dense subsets of $\mathbb{P}$. A filter $G \subseteq \mathbb{P}$ is $\mathcal{C}$-generic if $G \cap D \neq \varnothing$ for every $D \in \mathcal{C}$.

Theorem. If $\mathcal{C}$ is a countable collection of dense subsets of $\mathbb{P}$, then a $\mathcal{C}$-generic filter exists.

Proof. Let $\mathcal{C} = \lbrace D_n \rbrace _{n \in \omega}$. First, choose $x_0 \in D_0$. If $x_n \in D_n$ has been chosen, choose $x_{n+1} \in D_{n+1}$ such that $x_{n+1} \leq x_n$ (such an $x_{n+1}$ always exists since $D_{n+1}$ is dense). Then $G = \lbrace x_n \rbrace _{n \in \omega}$ is the desired generic filter. ■

Corollary. If $M$ is a countable transitive model, then there exists a $\mathbb{P}$-generic filter over $M$.

Proof. Since $M$ is countable, $\mathcal{C} = \lbrace D \in M : D \text{ is }\mathbb{P}\text{-dense}\rbrace $ is countable. ■

Thus, in countable transitive models, a $\mathbb{P}$-generic filter can always be assumed. This is why countability is crucial, as mentioned earlier.

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

8. Martin’s Axiom (A Digression)

When $\mathcal{C}$ is uncountable, a $\mathcal{C}$-generic filter may not exist. However, if the size of $\mathbb{P}$ is restricted, the possible dense sets become limited, and consequently, even uncountable $\mathcal{C}$ may admit a generic filter.

Definition. A subset $A$ of $\mathbb{P}$ is an antichain if, for any $x, y \in A$, $x \perp y$.

Definition. $\mathbb{P}$ satisfies the countable chain condition if every antichain in $\mathbb{P}$ is at most countable.

Antichains and the countable chain condition were discussed in this post, where it was explained that the countable chain condition limits the “width” of a partially ordered set (or tree).

Definition. Let $\kappa$ be a cardinal. $\mathrm{MA}_\kappa$ is the following statement: if $\mathbb{P}$ satisfies the countable chain condition and $\mathcal{C}$ is a collection of $\mathbb{P}$-dense sets with $|\mathcal{C}| \leq \kappa$, then a $\mathcal{C}$-generic filter exists.

By the earlier theorem, $\mathrm{MA}_{\aleph_0}$ trivially holds. On the other hand, $\mathrm{MA}_\kappa$ implies $\kappa < 2^{\aleph_0}$, so $\mathrm{MA}_\kappa$ does not hold for $\kappa \geq 2^{\aleph_0}$.

Martin’s Axiom. $\mathrm{MA}_\kappa$ holds for all $\kappa < 2^{\aleph_0}$.

Martin’s Axiom is known to be independent of ZFC. As one might expect, assuming Martin’s Axiom makes it easier to obtain generic filters, thereby allowing forcing to be used more freely. However, Martin’s Axiom is not necessary for proving the consistency of the continuum hypothesis.

9. $\mathbb{P}$-names

We have seen that if $\mathbb{P}$ is separative and $G$ is $\mathbb{P}$-generic, then $G \notin M$. Thus, it appears that $M$ cannot make any statements about $G$. However, by employing a rather ingenious trick, it is possible to make some statements about $G$ even within $M$. This is, in fact, the brilliance of forcing.

The idea is as follows. In this post, filters were described as collections of “true” elements. That is, if $\mathbb{P}$ is viewed as a collection of events, then $\mathcal{F}$ can be thought of as the collection of events that actually occur.

Building on this, one can conceive of the notion of a set that probabilistically contains elements using $\mathbb{P}$. For instance, if $p \in \mathbb{P}$ actually occurs, then a set $X$ containing $a$ as an element can be written as follows:

\[(a, p) \in X\]

The filter $G$ specifies which events actually occur. If $p \in G$, then $X$ actually contains $a$ as an element. This can be expressed as follows:

\[p \in G \iff a \in X_G\]

Metaphorically, $X$ is in a quantum superposition state, and $X_G$ is its state after observation.

However, $X$, such a probabilistic set, could itself be an element of another set. Conversely, in $(a, p) \in X$, $a$ could also be a probabilistic set. Thus, it is more accurate to write:

\[p \in G \iff a_G \in X_G\]

Formalising the above discussion rigorously yields the definition of $\mathbb{P}$-names.

Definition. A set of the following form is called a $\mathbb{P}$-name:
\[\sigma = \{ (\tau, p) : p \in \mathbb{P}, \tau \text{ is a }\mathbb{P}\text{-name}\}\]
The collection of $\mathbb{P}$-names in $M$ is denoted $M^\mathbb{P}$.

Note that, by definition, $M^\mathbb{P} \subseteq M$, but $M^\mathbb{P} \in M$ is not guaranteed.

Definition. Let $G$ be a filter on $\mathbb{P}$, and let $\sigma$ be a $\mathbb{P}$-name. The valuation of $\sigma$ in $G$, denoted $\sigma_G$, is defined as follows:
\[\sigma_G = \{ \tau_G \;|\; \exists p \in G : (\tau, p) \in \sigma \}\]

Although this definition may appear circular, it is, in fact, a proper recursive definition.[^3] For example, $\varnothing$ is a $\mathbb{P}$-name, and for any $p \in \mathbb{P}$, $\sigma = \lbrace (\varnothing, p) \rbrace$ is also a $\mathbb{P}$-name. In particular,

\[p \in G \iff \sigma_G = \{\varnothing\}\]

A slightly more complex example is the following: for $p, q, r \in \mathbb{P}$,

\[\tau = \bigg\{ \Big(\big\{ ( \{ (\varnothing, r) \}, p), (\varnothing, q) \big\}, q\Big), (\varnothing, r) \bigg\}\]

is a $\mathbb{P}$-name. Verify that if $q, r \in G$ and $p \notin G$, then

\[\tau_G = \{\{ \varnothing\}, \varnothing\}\]

At this point, let us introduce the following notation:

\[\check{x} = \{(y, 1): y \in x\}\]

Recall that $1$ is the greatest element of $\mathbb{P}$. From this, it follows that for any filter $G$, $\check{x}_G = x$. That is, $\check{x}$ is the “name tag” of $x$.

Earlier, it was stated that $\mathbb{P}$-names allow some statements about $G$ to be made within $M$, even though $G \notin M$. This is because the following $\mathbb{P}$-name serves as the “name tag” of $G$:

\[\Gamma = \{ (\check{p}, p) : p \in \mathbb{P} \}\]

The meaning of $\Gamma$ being the name tag of $G$ is that $\Gamma_G = G$ (verify this directly). This is significant because $\Gamma \in M$ since $\mathbb{P} \in M$. In other words, while $G$ may not be an element of $M$, its name tag is.

Now, define $M[G]$ as follows:

Definition.
\[M[G] = \{\sigma_G : \sigma \in M^\mathbb{P} \}\]

Since $\check{x}_G = x$ for $x \in M$, it follows that $M \subset M[G]$. Moreover, since $\Gamma_G = G$, it follows that $G \in M[G]$.

As you might have guessed, $M[G]$ is the “new” model of set theory we sought to obtain. That is, if $M$ is a transitive model of set theory, then $M[G]$ is also a transitive model of set theory. To prove this, we first define the forcing relation.

10. Forcing Relation

The structure of $M[G]$ varies significantly depending on the elements of $G$. Let us return to the earlier example of $\mathrm{Fn}(\omega, \omega)$. Previously, it was stated that if $f: \omega \to \omega$ is a total function, then there exists a filter $\mathcal{F}_f$ such that $\bigcup \mathcal{F}_f = f$. The converse also holds to some extent. If $G$ is a generic filter on $\mathrm{Fn}(\omega, \omega)$, then $\bigcup G: \omega \to \omega$ is a total function. Furthermore, if $(1, 2) \in G$, then $\bigcup G : 1 \mapsto 2$. This can be expressed as follows:

\[(1, 2) \text{ forces } \Big(\bigcup G \Big)(1) = 2\]

However, since we wish for the above statement to be expressible within $M$, we will instead use $\mathbb{P}$-names. That is,

\[(1, 2) \text{ forces } \Big(\bigcup \Gamma \Big)(\check{1}) = \check{2}\]

By defining this rigorously, we obtain the forcing relation.

Definition. Let $\phi(x_1, \dots, x_n)$ be a formula with $n$ free variables. For any $\tau_1, \dots, \tau_n \in M^\mathbb{P}$ and $p \in \mathbb{P}$, we define
\[p \Vdash \phi(\tau_1, \dots, \tau_n)\]
as follows: for any $\mathbb{P}$-generic filter $G$ over $M$,
\[p \in G \rightarrow \phi^{M[G]}\big((\tau_1)_G, \dots, (\tau_n)_G\big)\]

(At this point, it is worth reminding the reader that the truth of a formula is evaluated with respect to $V$.)

The symbol $\Vdash$ is read as “forces.”

As an example, the following statements hold:

\[(1, 2) \Vdash \Big(\bigcup \Gamma \Big)(\check{1}) = \check{2}\] \[\varnothing \Vdash \bigcup \Gamma \text{ is a function}\]

11. Fundamental theorems of forcing

We have now arrived at what may be called the core theorems of forcing. Unfortunately, the proofs are omitted here, as the author is still studying them; interested readers are referred to Kunen (1980) (Theorem 1 and Theorem 2 appear in Chapter 7, Theorem 3.6).

Fundamental Theorem 1 of Forcing. Given $\mathbb{P}$, there exists a relation $\Vdash^\ast$ that is definable within $M$ such that the following holds: for any formula $\phi$ and any $\mathbb{P}$-names $\tau_1, \dots, \tau_n$,
\[p \Vdash \phi(\tau_1, \dots, \tau_n) \longleftrightarrow \Big( p \Vdash^* \phi(\tau_1, \dots, \tau_n) \Big)^M\]

In other words, even inside $M$ one can faithfully simulate the forcing relation $\Vdash$. Another central theorem is the following.

Fundamental Theorem 2 of Forcing. For any formula $\phi$ and any $\mathbb{P}$-names $\tau_1, \dots, \tau_n$,
\[\bigg( \phi\Big((\tau_1)_G, \dots, (\tau_n)_G \Big) \bigg)^{M[G]} \longleftrightarrow \;\;\exists p \in G : p \Vdash \phi(\tau_1, \dots, \tau_n)\]

Thus, if $\phi(\vec{\tau}_G)$ is true in $M[G]$, this is because there is some $p \in G$ that forces $\phi(\vec{\tau})$. Intuitively, one may understand this as follows. To say that $G$ is generic is to say that $G$ is maximally “typical”, in the sense of avoiding accidental coincidences or exceptional behaviour. Hence, if $M[G]$ comes to satisfy a sentence $\phi$ that was not satisfied in $M$, this must be due not to some external accident, but rather to conditions internal to $G$.

From Fundamental Theorems 1 and 2, we obtain the following.

Theorem. If $M$ is a transitive model of set theory, then $M[G]$ is also a transitive model of set theory.

(The proof below was written informally by the author, and may contain mistakes.)

Proof. We first show transitivity. Let $x \in M[G]$, and suppose $y \in x$. We must show that $y \in M[G]$.

By the definition of $M[G]$, there exists some $\sigma \in M^{\mathbb{P}}$ such that $x = \sigma_G$. Hence, from $y \in x$ we obtain that there exist a $\mathbb{P}$-name $\tau$ and some $p \in G$ such that $(\tau, p) \in \sigma$ and $y = \tau_G$. Since $M$ is transitive, from

\[(\tau, p) \in \sigma \in M\]

it follows that $(\tau, p) \in M$, and in particular that $\tau \in M$. Therefore $\tau_G = y \in M[G]$.

We now show that $M[G]$ is a model of ZFC. Let $\theta$ be an axiom of ZFC; it suffices to show that $\theta^{M[G]}$ holds. For instance, suppose that $\theta$ is the Separation axiom for $\phi(x, \vec{c})$ (where $\vec{c}$ are elements of $M[G]$). That is,

\[\theta \Leftrightarrow \forall x \;\exists y : y = \{z \in x : \phi(z, \vec{c})\}\]

and

\[\theta \Leftrightarrow \forall x \in M[G] \;\exists y\in M[G] : y = \{z \in x : \phi^{M[G]}(z, \vec{c})\}\]

By the meaning of $\forall$, it is enough to show that $\theta^{M[G]}$ holds for an arbitrary $x = a$.

\[y = \{ z \in \sigma_G : \phi^{M[G]}(z, \vec{c}) \} \in M[G]\]

By the definition of $M[G]$, there exists some $\vec{\tau} \in M^{\mathbb{P}}$ such that $\vec{c} = \vec{\tau}_G$. Hence we may rewrite this as

\[y = \Big\{ \pi_G : \big( \exists p\in G\; (\pi, p) \in \sigma \big)\land\phi^{M[G]}(\pi_G, \vec{\tau}_G) \Big\}\]

By Fundamental Theorem 2,

\[\phi^{M[G]}(\pi_G, \vec{\tau}_G) \longleftrightarrow \exists q \in G: q\Vdash \phi(\pi, \vec{\tau})\]

By Fundamental Theorem 1,

\[q\Vdash \phi(\pi, \vec{\tau}) \longleftrightarrow \Big( q \Vdash^* \phi(\pi, \vec{\tau}) \Big)^M\]

Therefore,

\[y = \Big\{ \pi_G : \big( \exists p\in G\; (\pi, p) \in \sigma \big)\land \big(\exists q \in G\; (q \Vdash^* \phi(\pi, \vec{\tau}))^M \big) \Big\}\]

At this point we use the following lemma.

Lemma. Given $\sigma \in M^{\mathbb{P}}$, define
\[\overline{\sigma} = \Big\{ (\tau, q) : \exists p \in \mathbb{P} \; \Big( (\tau, p) \in \sigma \land q \leq p \Big) \Big\}\]
Then $\sigma_G = \overline{\sigma}_G$, and $\overline{\sigma} \in M^{\mathbb{P}}$.

The proof is almost immediate from the definition of a filter. From the lemma, the directedness of the filter, and the directedness of $\Vdash^\ast$, we obtain: $(r \leq p, q)$

\[y = \Big\{ \pi_G : \exists r\in G\; \big( (\pi, r) \in \overline{\sigma} \land (r \Vdash^* \phi(\pi, \vec{\tau}))^M \big) \Big\}\]

Hence, if

\[\check{y} = \Big\{ (\pi, r) \in \overline{\sigma} : \big(r \Vdash^* \phi(\pi, \vec{\tau})\big)^M \ \Big\}\]

then $\check{y}_G = y$. But $\check{y}$ is obtained in $M$ by separating $\sigma$ using the predicate $p \Vdash^\ast \phi(\pi, \vec{\tau})$, and since $M$ satisfies Separation, we have $\check{y} \in M$, and hence $y \in M[G]$.

The remaining axioms may be proved similarly. ■

In this way, we defined the forcing relation, examined its properties, and gave a brief proof that $M[G]$ arising from a generic filter is a model of set theory. In the next post, we shall use the results obtained here to prove the consistency of the Continuum Hypothesis.

Reference

Hanul Jeon, What is forcing? (2015)
K. Kunen, Set theory: an introduction to independence proofs (1980)
K. Hrbacek & T. Jech, Introduction to Set Theory, 3rd edition (1999)

$y = \mathcal{P}(x)$ is defined as $\forall z \subseteq x (z \in y) \land \forall z \in y (z \subseteq x)$, and $y \subseteq x$ is defined as $\forall z \in y (z \in x)$. ↩
This is a passage that calls to mind Frege–Russell-style descriptivism. ↩

범주론적 대각합

26 Jan 2026

수학

범주론

대수학

독자는 선형대수학 수업에서 행렬의 대각합_trace을 배웠을 것이다.

정의. $n\times n$ 행렬 $A$의 대각합 $\mathrm{tr}(A)$를, $A$의 대각 성분들의 합으로 정의한다.

특히, 대각합은 다음의 성질을 가진다.

정리. $A$와 $B$가 닮은 행렬이면 $\operatorname{tr}(A) = \operatorname{tr}(B)$이다.

즉, 대각합은 행렬에 대한 함수일 뿐 아니라 선형 변환에 대한 함수이기도 하다.

그런데 이 결과를 알고 나서 대각합의 정의를 다시 보면 불만족스러운(짜치는) 점이 있다. 대각합이 선형 변환에 대한 함수임에도 불구하고, “대각선 성분의 합”이라는 직관적 의미가 있는 행렬의 경우와 달리 선형 변환의 경우에는 직관적 의미가 읽히지 않기 때문이다. 이는 행렬값이 “단위 정사각형이 변환된 후의 넓이”라는 직관적 의미를 가지는 것과 대조된다. 때문에 대각합이 선형 변환에 대한 함수라는 사실은 마치 우연처럼 보인다.

옆집 물리학과 애들이라면 그게 뭐 어째서 ‒ 하며 계산이나 하러 가겠지만, 우리 우수한 수학과의 인재들은 이를 차치해서는 안 될 것이다. 더욱 자연스럽고 수학적인 방식은, 대각합을 먼저 선형 변환에 대한 함수로서 정의한 뒤, 이 정의가 기존의 정의와 일치함을 보이는 것이다. 이것이 범주론적 대각합_{categorical trace}의 동기이다.

먼저 몇 가지 함수를 정의하자. 쌍대 공간의 정의에 따라, 임의의 $k$-벡터 공간 $V$에 대해 다음 사상이 존재한다.

\[\mathrm{ev} : V^* \otimes V \to k;\; \phi \otimes v \mapsto \phi(v)\]

그런데 $(\phi, v) \mapsto \phi(v)$가 쌍선형 사상_{bilinear map}이므로, 텐서곱의 정의에 의해 $\mathrm{ev}$는 선형 사상이다. 따라서 $\mathrm{ev}$의 쌍대를 취할 수 있는데, 이를 $\mathrm{coev}$라고 하자. 즉,

\[\mathrm{coev}: k^* \to (V^* \otimes V)^*\]

그런데 $k^\ast$는 $k$와 표준적_canonically으로 동형이고, $V^{\ast\ast}$는 $V$와 표준적으로 동형이며, $V, W$가 유한 차원 벡터 공간일 때 $(V \otimes W)^\ast$는 $V^\ast \otimes W^\ast$와 표준적으로 동형이므로, $\mathrm{coev}$를 다음과 같이 생각해도 무방하다.

\[\mathrm{coev}: k \to V \otimes V^*\]

Remark. 그냥 동형이 아니라 표준적으로 동형이라는 사실이 중요하다. 일례로 $V^\ast \cong V$이지만, 이 동형은 특정 기저의 선택에 의존적이기 때문에 $\mathrm{coev} : k \to V \otimes V$라고 할 수는 없다. 우리의 목표는 $\mathrm{ev}$와 $\mathrm{coev}$로부터 선형 변환에 대한 함수를 구성하는 것이기 때문에, 기저 의존성이 개입되어서는 안 된다.

Remark. $V, W$가 유한 차원 공간일 때, 다음과 같이 정의되는 $\Psi : V^\ast \otimes W^\ast \to (V \otimes W)^\ast$는 표준 동형 사상이다.
\[\Psi : \phi \otimes \psi \mapsto \big( (v\otimes w) \mapsto \phi(v)\psi(w) \big)\]
유한 차원 조건이 필요한 이유는 전사성을 보장하기 위해서이다. 이 경우, 다음 사상이 위 사상의 역사상임을 확인할 수 있다. ($\lbrace e_i\rbrace , \lbrace f_j\rbrace $는 $V, W$의 기저이고, $\lbrace g_{ij} \rbrace $는 $V \otimes W$의 기저이다.)
\[\Psi^{-1} : g^{ij} \mapsto e^i \otimes f^j\]
표면적으로 $\Psi^{-1}$은 기저 선택에 의존적인 것으로 보이지만, $\Psi$가 기저 의존적이지 않고, $\Psi^{-1}$이 $\Psi$의 역사상이기 때문에, $\Psi^{-1}$ 또한 기저 의존적이지 않다.

$k$는 1차원 벡터 공간이므로, $\mathrm{coev}(1)$을 알면 $\mathrm{coev}$ 전체가 결정된다. $\mathrm{coev}(1)$을 구하기 위해 다음의 정의를 떠올리자. $T : V \to W$에 대해,

\[T^*: W^* \to V^*; \; \psi \mapsto \psi \circ T\]

이다. 따라서,

\[\mathrm{coev} : 1 \mapsto 1 \circ \mathrm{ev} = \mathrm{ev}\]

즉, $\mathrm{coev}(1)$은 다름아닌 $\mathrm{ev}$이다. 이는 말이 되는 게, $\mathrm{ev}$는 $V^\ast \otimes V$를 스칼라로 보내는 선형 사상이므로 그 자체로 $(V^\ast \otimes V)^\ast = V \otimes V^\ast$의 원소이다.

그렇다면 $\mathrm{ev}$는 어떻게 $V \otimes V^\ast$로 표현될까? 우선 $\mathrm{ev}$가 $(V^\ast \otimes V)^\ast$에서 어떻게 표현되는지를 보자. $V$의 기저 $\lbrace e_i\rbrace $를 선택한다면, $V^\ast \otimes V$의 기저가 $\lbrace f_{ij} = e^i \otimes e_j \rbrace $로 유도된다. $\mathrm{ev}$의 정의로부터 다음이 따라 나온다.

\[\mathrm{ev} = \sum_i f^{ii}\]

위의 Remark에서 정의한 $\Psi^{-1}$을 우변에 취하면 다음을 얻는다.

\[\mathrm{ev} = \sum_i e_i \otimes e^i\]

이제 범주론적 대각합을 정의하자.

정의. $k$-벡터 공간 $V$와 선형 변환 $T: V \to V$에 대해, $\operatorname{tr}_T: k \to k$를 다음 사상들의 합성으로 정의한다.
\[k \xrightarrow{\operatorname{coev}} V \otimes V^* \xrightarrow{T \otimes \mathrm{id}} V \otimes V^* \xrightarrow{\operatorname{twist}} V^* \otimes V \xrightarrow{\operatorname{ev}} k\]
$\mathrm{tr}_T$를 범주론적 대각합이라고 한다.

정의에 사용된 사상들이 모두 기저 비의존적이므로, 범주론적 대각합 또한 기저 비의존적이다. 범주론적 대각합과 기존의 대각합은 다음의 관계에 있다.

정리. $\mathrm{tr}_T(1) = \mathrm{tr}(T)$

증명. 다음이 성립한다.

\[\begin{align} 1 &\xmapsto{\operatorname{coev}} \sum e_i \otimes e^i \\ &\xmapsto{T \otimes \mathrm{id}} \sum T(e_i) \otimes e^i \\ &\xmapsto{\operatorname{twist}} \sum e^i \otimes T(e_i) \\ &\xmapsto{\operatorname{ev}} \sum [T]_{ii} \end{align}\]

여기서 $[T]$는 $\lbrace e_i \rbrace $를 기저로 선택했을 때 $T$의 행렬 표현이다. 따라서 $\sum [T]_{ii} = \mathrm{tr}(T)$이다. ■

이로써 구차한 계산 없이 $\mathrm{tr}$이 선형 변환에 대한 함수임을 보일 수 있다. 우아하지 않은가?

한편 예리한 독자는 “범주론적 대각합”이라는 이름이 무색하게, 지금까지의 논의에서 특별히 범주론적인 개념은 등장하지 않았음을 눈치챘을 것이다. 그럼에도 불구하고 이 글에서 정의한 개념의 이름이 “범주론적 대각합”인 이유는, 이 글에서 소개한 정의를 벡터 공간에서 대칭적 모노이드 범주_{symmetric monoidal category}로 자연스럽게 일반화할 수 있기 때문이다. 관심 있는 독자는 위키피디아 문서를 참고하길 바란다.

Categorical Trace

26 Jan 2026

Mathematics

Category Theory

Algebra

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

Readers may recall learning about the trace of a matrix in their linear algebra classes.

Definition. For an $n \times n$ matrix $A$, the trace $\mathrm{tr}(A)$ is defined as the sum of the diagonal entries of $A$.

In particular, the trace has the following property:

Theorem. If $A$ and $B$ are similar matrices, then $\operatorname{tr}(A) = \operatorname{tr}(B)$.

In other words, the trace is not merely a function of matrices but also a function of linear transformations.

However, upon revisiting the definition of the trace after learning this result, one might find it somewhat unsatisfactory. While the trace has an intuitive meaning as “the sum of diagonal entries” in the case of matrices, this intuitive interpretation is absent for linear transformations. This contrasts with determinants, which have an intuitive meaning as “the scaling factors of unit squares under the transformation.” As a result, the fact that the trace is a function of linear transformations might seem almost coincidental.

While our neighbours in the physics department might shrug and proceed with their calculations, we, the brilliant minds of the mathematics department, should not settle for such a perspective. A more natural and mathematical approach would be to define the trace first as a function of linear transformations and then show that this definition coincides with the traditional one. This is the motivation for the categorical trace.

Let us first define a few maps. By the definition of the dual space, for any $k$-vector space $V$, the following map exists:

\[\mathrm{ev} : V^* \otimes V \to k;\; \phi \otimes v \mapsto \phi(v)\]

Since $(\phi, v) \mapsto \phi(v)$ is a bilinear map, $\mathrm{ev}$ is a linear map by the definition of the tensor product. Thus, we can take the dual of $\mathrm{ev}$, which we shall denote by $\mathrm{coev}$. That is,

\[\mathrm{coev}: k^* \to (V^* \otimes V)^*\]

However, since $k^\ast$ is canonically isomorphic to $k$, $V^{\ast\ast}$ is canonically isomorphic to $V$, and $(V \otimes W)^\ast$ is canonically isomorphic to $V^\ast \otimes W^\ast$ for finite-dimensional vector spaces $V$ and $W$, we may equivalently think of $\mathrm{coev}$ as follows:

\[\mathrm{coev}: k \to V \otimes V^*\]

Remark. The fact that these are canonical isomorphisms is crucial. For instance, while $V^\ast \cong V$, this isomorphism depends on the choice of a basis, so we cannot write $\mathrm{coev} : k \to V \otimes V$. Our goal is to construct a function of linear transformations from $\mathrm{ev}$ and $\mathrm{coev}$, so basis dependence must be avoided.

Remark. For finite-dimensional spaces $V$ and $W$, the map $\Psi : V^\ast \otimes W^\ast \to (V \otimes W)^\ast$ defined as follows is a canonical isomorphism:
\[\Psi : \phi \otimes \psi \mapsto \big( (v\otimes w) \mapsto \phi(v)\psi(w) \big)\]
The finite-dimensionality condition is necessary to ensure surjectivity. In this case, the following map can be verified as the inverse of the above map (where $\lbrace e_i\rbrace , \lbrace f_j\rbrace $ are bases of $V$ and $W$, and $\lbrace g_{ij} \rbrace $ is a basis of $V \otimes W$):
\[\Psi^{-1} : g^{ij} \mapsto e^i \otimes f^j\]
While $\Psi^{-1}$ appears to depend on the choice of basis, it is not basis-dependent because $\Psi$ itself is not basis-dependent, and $\Psi^{-1}$ is the inverse of $\Psi$.

Since $k$ is a one-dimensional vector space, knowing $\mathrm{coev}(1)$ determines $\mathrm{coev}$ entirely. To compute $\mathrm{coev}(1)$, recall the following definition: for $T : V \to W$,

\[T^*: W^* \to V^*; \; \psi \mapsto \psi \circ T\]

Thus,

\[\mathrm{coev} : 1 \mapsto 1 \circ \mathrm{ev} = \mathrm{ev}\]

In other words, $\mathrm{coev}(1)$ is none other than $\mathrm{ev}$. This makes sense because $\mathrm{ev}$ is a linear map that sends $V^\ast \otimes V$ to scalars, and hence it is itself an element of $(V^\ast \otimes V)^\ast = V \otimes V^\ast$.

How, then, is $\mathrm{ev}$ expressed in $V \otimes V^\ast$? First, let us see how $\mathrm{ev}$ is expressed in $(V^\ast \otimes V)^\ast$. Choosing a basis $\lbrace e_i\rbrace $ of $V$, we obtain the induced basis $\lbrace f_{ij} = e^i \otimes e_j \rbrace $ of $V^\ast \otimes V$. From the definition of $\mathrm{ev}$, we have:

\[\mathrm{ev} = \sum_i f^{ii}\]

Applying $\Psi^{-1}$, as defined in the Remark above, to the right-hand side yields:

\[\mathrm{ev} = \sum_i e_i \otimes e^i\]

Now, let us define the categorical trace.

Definition. For a $k$-vector space $V$ and a linear transformation $T: V \to V$, define $\operatorname{tr}_T: k \to k$ as the composition of the following maps:
\[k \xrightarrow{\operatorname{coev}} V \otimes V^* \xrightarrow{T \otimes \mathrm{id}} V \otimes V^* \xrightarrow{\operatorname{twist}} V^* \otimes V \xrightarrow{\operatorname{ev}} k\]
This $\mathrm{tr}_T$ is called the categorical trace.

Since all the maps used in the definition are basis-independent, the categorical trace is also basis-independent. The relationship between the categorical trace and the traditional trace is as follows:

Theorem. $\mathrm{tr}_T(1) = \mathrm{tr}(T)$

Proof. The following holds:

Here, $[T]$ is the matrix representation of $T$ with respect to the basis $\lbrace e_i \rbrace $. Thus, $\sum [T]_{ii} = \mathrm{tr}(T)$. ■

This demonstrates that $\mathrm{tr}$ is a function of linear transformations without resorting to cumbersome calculations. Elegant, isn’t it?

Meanwhile, the astute reader may have noticed that, despite the name “categorical trace,” no particularly categorical concepts have appeared in the discussion so far. Nevertheless, the concept defined in this post is called the “categorical trace” because the definition introduced here can be naturally generalised from vector spaces to symmetric monoidal categories. Interested readers are encouraged to consult the Wikipedia article for further details.

측도 문제와 연속체 가설

15 Jan 2026

수학

집합론

측도론

정의. 측도 $\mu$가 집합 $X$의 모든 부분집합에 대해 정의될 때, $\mu$를 $X$ 위의 측도라고 한다.

정리. 임의의 $A \subseteq \mathbb{R}$과 $x \in \mathbb{R}$에 대해, $\mu(A + x) = \mu(A)$를 만족하는 실수 위의 측도 $\mu$는 존재하지 않는다. ($A + x = \lbrace a + x : a \in A\rbrace$)

증명. 비탈리 집합의 구성을 참고하라.

즉, 평행이동 불변적인 실수 위의 측도는 존재하지 않는다. 대신 평행이동 불변 조건을 약화하여, 르벡 가측인 집합 $A$에 대해서만 $\mu(A) = m(A)$를 만족하는 실수 위의 측도 $\mu$가 가능하지 않을지 물을 수 있다 ($m$은 르벡 측도). 이것을 측도 문제라고 한다.

측도 문제. 임의의 르벡 가측 집합 $A\subseteq \mathbb{R}$에 대해 $\mu(A) = m(A)$를 만족하는 실수 위의 측도 $\mu$가 존재하는가?

사실 위 문제는 다음의 일반화된 문제와 동치임이 알려져 있다.

일반화된 측도 문제. 비자명한 $2^{\aleph_0}$ 위의 측도가 존재하는가?

만약 그러한 측도가 존재한다면, 해당 측도와 $\mathbb{R}$과 $2^{\aleph_0}$ 사이의 전단사 사상을 사용하여 르벡 측도를 $\mathcal{P}(\mathbb{R})$로 확장할 수 있음이 알려져 있기 때문이다.

Remark. 평행이동 불변 조건이 사라졌기 때문에, 집합 $S$ 위에서 측도의 존재성은 오로지 $S$의 기수에만 의존적이다. 따라서 이후 글에서는 기수 위의 측도에 관해서만 관심을 가진다.

흥미롭게도 측도 문제는 연속체 가설 및 큰 기수 공리와 관련이 있다. 먼저 다음 정리를 보자.

정리. $\nu$ 위의 측도가 존재한다면, 어떤 $\kappa \leq \nu$는 약하게 도달 불가능_{weakly inaccessible}하다.

증명. $\nu$ 위의 측도 $\mu$가 존재한다고 하자. $\mu$에 대한 영집합_{null set}들의 모임 $\mathcal{I}$는 아이디얼_ideal이다. 특히, $\mathcal{I}$가 다음을 만족함을 쉽게 보일 수 있다.

홑원소집합 포함: 임의의 $\xi < \nu$에 대해, $\lbrace \xi \rbrace \in \mathcal{I}$.
가산 닫힘: $\lbrace A_n \rbrace _{n < \omega}$이 $\mathcal{I}$에 속하는 집합들의 모임일 때, $\bigcup_{n < \omega} A_n \in \mathcal{I}$.
쌍대 가산 체인 성질_{co-c.c.c. (countable chain condition)}: $\lbrace B_i \rbrace _{i \in J}$가 $\mathcal{I}$에 속하지 않는 서로소 집합들의 모임일 때, $|J| \leq \aleph_0$.

$\kappa \leq \nu$가 위 1, 2, 3의 조건을 만족하면서 $\kappa$의 부분집합으로 이루어진 아이디얼 $\mathcal{J}$가 존재하는 최소 기수라고 하자. 다음의 보조정리를 증명한다.

정의. $\mathcal{I}$가 아이디얼이라고 하자. 임의의 $\lambda < \kappa$에 대해, $\lbrace A_\xi \rbrace _{\xi < \lambda}$가 $\mathcal{I}$의 원소들의 모임일 때, $\bigcup A_\xi \in \mathcal{I}$라면 $\mathcal{I}$가 $\kappa$-완전_{$\kappa$-complete}하다고 한다.

보조정리. $\mathcal{J}$는 $\kappa$-완전하다.

증명. $\mathcal{J}$가 $\kappa$-완전하지 않다고 하자. 그럼 어떠한 $\lambda < \kappa$와 $\mathcal{J}$의 원소들의 모임 $\lbrace A_\xi \rbrace _{\xi < \lambda}$가 존재하여 $\bigcup A_\xi \notin \mathcal{J}$이다. 필요하다면 다음과 같이 재정의함으로써, 일반성을 잃지 않고 $\lbrace A_\xi\rbrace $가 쌍으로 서로소라고 가정할 수 있다.

\[A_\xi' = A_\xi - \bigcup_{\eta < \xi}A_\eta\]

($A_\xi’ \subseteq A_\xi$이므로 $A_\xi \in \mathcal{J}$로부터 $A_\xi’ \in \mathcal{J}$가 따라 나온다. 또한 $A_\xi’$는 공집합일 수도 있지만, 공집합은 다른 집합과 언제나 서로소이기 때문에 이 사실은 증명에 영향을 미치지 않는다.)

다음과 같이 $\lambda$ 위의 아이디얼 $\mathcal{K}$를 정의한다.

\[S \in \mathcal{K} \iff \bigcup_{\xi \in S} A_\xi \in \mathcal{J}\]

$\mathcal{K}$가 1, 2, 3 조건을 만족함을 보이자.

홑원소집합 포함: $\mathcal{K}$의 정의로부터 자명하다.
가산 닫힘: $\mathcal{J}$가 가산 닫힘이므로 $\mathcal{K}$ 또한 가산 닫힘이다.
쌍대 가산 체인 성질: $\lbrace S_i \rbrace _{i \in J}$가 $\mathcal{K}$에 속하지 않는 서로소 집합들의 모임이라고 하자. 각 $i \in J$에 대해 $B_i = \bigcup_{\xi \in S_i} A_\xi$라고 하자. $B_i \notin \mathcal{J}$이고, $i \neq j$일 때 $B_i \cap B_j = \varnothing$이다 (여기서 $\lbrace A_\xi\rbrace $가 쌍으로 서로소라는 성질이 사용된다). 즉 $\lbrace B_i\rbrace _{i \in J}$는 $\mathcal{J}$의 쌍대 반체인이므로, $J$는 가산이다.

그런데 이는 $\kappa$의 최소성 정의에 모순된다. 따라서 $\mathcal{J}$는 $\kappa$-완전하다. □

위 따름정리로부터 $\kappa$가 약하게 도달 불가능함을, 즉 비가산 정칙 극한 기수임을 보일 수 있다.

$\kappa$는 비가산이다.

홑원소집합 포삼 성질과 가산 닫힘 성질에 의해 $\kappa$가 가산이라면 $\kappa \in \mathcal{J}$가 되어 모순이다.

$\kappa$는 정칙이다.

$\kappa$가 정칙이 아니라면 어떤 $\lambda < \kappa$에 대해 $\kappa$보다 작은 기수들의 증가열 $\lbrace \nu_\xi \rbrace _{\xi < \lambda}$가 존재하여 $\bigcup_{\xi < \lambda} \nu_\xi = \kappa$이다. 홑원소집합 포함 성질과 $\kappa$-완전성에 의해 각 $\nu_\xi$는 $\mathcal{J}$의 원소이다. 따라서 다시 $\kappa$-완전성에 의해 $\bigcup_{\xi < \lambda} \nu_\xi = \kappa$는 $\mathcal{J}$의 원소이다. 그런데 이는 $\mathcal{J}$가 아이디얼이라는 사실에 모순된다.

$\kappa$는 극한 기수이다.

$\kappa$가 극한 기수가 아니라면 어떤 서수 $\alpha$에 대해 $\kappa = \aleph_{\alpha + 1}$이다. 따라서 각 $\gamma < \kappa$에 대해, 전사인 $f_\gamma: \omega_\alpha \to \gamma$가 존재한다. 선택 공리를 사용하여 그러한 함수열 $\lbrace f_\gamma \rbrace _{\gamma < \omega_{\alpha + 1}}$을 얻는다.

이제 각 $\beta < \omega_\alpha, \gamma < \omega_{\alpha + 1}$에 대해 다음과 같이 $A_{\gamma\beta}$를 정의한다.

\[A_{\gamma\beta} = \{ \xi < \omega_{\alpha + 1} : f_\xi(\beta) = \gamma \}\]

다음을 관찰하라.

$\bigcup_{\beta < \omega_\alpha} A_{\gamma\beta} = \omega_{\alpha + 1} - (\gamma + 1)$
$\gamma_1 \neq \gamma_2 \implies A_{\gamma_1 \beta} \cap A_{\gamma_2 \beta} = \varnothing$

특히, $\kappa$-완전성에 의해 $\gamma + 1 \in \mathcal{J}$이므로 $\bigcup_{\beta < \omega_\alpha}A_{\gamma \beta} \notin \mathcal{J}$이다. 따라서 다시 $\kappa$-완전성에 의해 어떤 $\beta$에 대하여 $A_{\gamma\beta} \notin \mathcal{J}$이다. 각 $\gamma$에 대해 그러한 $\beta_\gamma$를 선택하여, $\lbrace A_{\gamma\beta_\gamma}\rbrace _{\gamma < \omega_{\alpha + 1}}$을 얻는다. 그런데 가능한 $\beta_\gamma$의 경우의 수는 $\omega_\alpha$이므로, 비둘기집의 원리에 의해 어떤 $\beta < \omega_{\alpha}$가 존재하여 $|\lbrace A_{\gamma\beta_\gamma} : \beta_\gamma = \beta \rbrace | = \aleph_{\alpha + 1}$이다. 그런데 이는 가산 쌍대 체인 성질에 모순된다. ■

따름정리. 측도 문제의 답이 긍정적이라면 연속체 가설의 답은 부정적이다.

증명. 측도 문제가 성립하고 $2^{\aleph_0} = \aleph_1$이라면, 정리에 의해 $\aleph_0$ 또는 $\aleph_1$이 약하게 도달 불가능해야 하는데, $\aleph_0$는 가산이고 $\aleph_1$은 계승_successor 기수이므로 이는 모순이다. ■

The Measure Problem and the Continuum Hypothesis

15 Jan 2026

Mathematics

Set Theory

Measure Theory

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

Definition. When a measure $\mu$ is defined for all subsets of a set $X$, $\mu$ is called a measure on $X$.

Theorem. For any $A \subseteq \mathbb{R}$ and $x \in \mathbb{R}$, there does not exist a measure $\mu$ on the real numbers satisfying $\mu(A + x) = \mu(A)$. (Here, $A + x = \lbrace a + x : a \in A\rbrace$.)

Proof. Refer to the construction of the Vitali set.

Thus, a translation-invariant measure on the real numbers does not exist. Instead, one may weaken the translation-invariance condition and ask whether there exists a measure $\mu$ on the real numbers such that $\mu(A) = m(A)$ for Lebesgue measurable sets $A$, where $m$ denotes the Lebesgue measure. This is called the measure problem.

Measure Problem. Does there exist a measure $\mu$ on the real numbers such that $\mu(A) = m(A)$ for any Lebesgue measurable set $A \subseteq \mathbb{R}$?

In fact, the above problem is known to be equivalent to the following generalised problem:

Generalised Measure Problem. Does there exist a non-trivial probabilistic measure $\mu$ on $2^{\aleph_0}$? (A measure is probabilistic if the measure of the whole space is 1.)

If such a measure exists, it is known that the Lebesgue measure can be extended to $\mathcal{P}(\mathbb{R})$ using this measure and a bijection between $\mathbb{R}$ and $2^{\aleph_0}$.

Remark 1. Since the translation-invariance condition has been removed, the existence of a measure on a set $S$ depends solely on the cardinality of $S$. Therefore, in the subsequent discussion, we focus only on measures on (uncountable) cardinals.

Remark 2. It is known that if $\mu$ is a probabilistic measure on $S$, $C = \lbrace x \in S : \mu(\lbrace x \rbrace ) > 0 \rbrace $ is at most countable. Hence if $|S| = \kappa$ is uncountable, we have $|S| = |S \setminus C|$, and $\mu$ restricted to $S \setminus C$ is a measure satisfying $\mu(\lbrace x \rbrace ) = 0$ for all $x \in S \setminus C$. It follows that if a measure $\mu$ exists on an uncountable cardinal $\kappa$, we may assume without loss of generality $\mu(\lbrace \alpha \rbrace ) = 0$ for all $\alpha < \kappa$. Indeed, we will assume this fact from now on.

Interestingly, the measure problem is related to the continuum hypothesis and large cardinal axioms. Let us first examine the following theorem:

Theorem. If a probabilistic measure exists on $\nu$, then some $\kappa \leq \nu$ is weakly inaccessible.

Proof. Suppose a probabilistic measure $\mu$ exists on $\nu$. The collection $\mathcal{I}$ of null sets with respect to $\mu$ forms an ideal. In particular, it can be easily shown that $\mathcal{I}$ satisfies the following properties:

Contains singletons: For any $\xi < \nu$, $\lbrace \xi \rbrace \in \mathcal{I}$.
Countably closed: If $\lbrace A_n \rbrace_{n < \omega}$ is a collection of sets in $\mathcal{I}$, then $\bigcup_{n < \omega} A_n \in \mathcal{I}$.
Satisfies the countable chain condition: If $\lbrace B_i \rbrace_{i \in J}$ is a collection of pairwise disjoint sets not in $\mathcal{I}$, then $|J| \leq \aleph_0$.

Let $\kappa \leq \nu$ be the smallest cardinal such that there exists an ideal $\mathcal{J}$ on $\kappa$ satisfying the above three conditions. We now prove the following lemma:

Definition. Let $\mathcal{I}$ be an ideal. If for any $\lambda < \kappa$, $\lbrace A_\xi \rbrace_{\xi < \lambda}$ is a collection of sets in $\mathcal{I}$, and $\bigcup A_\xi \in \mathcal{I}$, then $\mathcal{I}$ is said to be $\kappa$-complete.

Lemma. $\mathcal{J}$ is $\kappa$-complete.

Proof. Suppose $\mathcal{J}$ is not $\kappa$-complete. Then there exist some $\lambda < \kappa$ and a collection $\lbrace A_\xi \rbrace_{\xi < \lambda}$ of sets in $\mathcal{J}$ such that $\bigcup A_\xi \notin \mathcal{J}$. Without loss of generality, we may redefine $\lbrace A_\xi \rbrace$ as follows to assume that the sets are pairwise disjoint:

\[A_\xi' = A_\xi - \bigcup_{\eta < \xi} A_\eta\]

(Since $A_\xi’ \subseteq A_\xi$, it follows that $A_\xi’ \in \mathcal{J}$ from $A_\xi \in \mathcal{J}$. Additionally, $A_\xi’$ may be empty, but this does not affect the proof as empty sets are trivially disjoint.)

Define an ideal $\mathcal{K}$ on $\lambda$ as follows:

\[S \in \mathcal{K} \iff \bigcup_{\xi \in S} A_\xi \in \mathcal{J}\]

We verify that $\mathcal{K}$ satisfies the three conditions:

Contains singletons: This is immediate from the definition of $\mathcal{K}$.
Countably closed: Since $\mathcal{J}$ is countably closed, so is $\mathcal{K}$.
Satisfies the countable chain condition: Suppose $\lbrace S_i \rbrace_{i \in J}$ is a collection of pairwise disjoint sets not in $\mathcal{K}$. For each $i \in J$, let $B_i = \bigcup_{\xi \in S_i} A_\xi$. Then $B_i \notin \mathcal{J}$, and $B_i \cap B_j = \varnothing$ for $i \neq j$ (using the disjointness of $\lbrace A_\xi \rbrace$). Thus, $\lbrace B_i \rbrace_{i \in J}$ forms a pairwise disjoint antichain in $\mathcal{J}$, implying that $J$ is countable.

This contradicts the minimality of $\kappa$. Hence, $\mathcal{J}$ is $\kappa$-complete. □

From the above lemma, it follows that $\kappa$ is weakly inaccessible, i.e., an uncountable regular limit cardinal.

$\kappa$ is uncountable.

The singleton containment and countable closure properties imply that if $\kappa$ were countable, $\kappa \in \mathcal{J}$, leading to a contradiction.

$\kappa$ is regular.

If $\kappa$ were not regular, there would exist some $\lambda < \kappa$ and an increasing sequence $\lbrace \nu_\xi \rbrace_{\xi < \lambda}$ of cardinals less than $\kappa$ such that $\bigcup_{\xi < \lambda} \nu_\xi = \kappa$. By the singleton containment and $\kappa$-completeness properties, each $\nu_\xi$ belongs to $\mathcal{J}$. Hence, by $\kappa$-completeness, $\bigcup_{\xi < \lambda} \nu_\xi = \kappa$ belongs to $\mathcal{J}$, contradicting the fact that $\mathcal{J}$ is an ideal.

$\kappa$ is a limit cardinal.

If $\kappa$ were not a limit cardinal, there would exist some ordinal $\alpha$ such that $\kappa = \aleph_{\alpha + 1}$. For each $\gamma < \kappa$, there exists a surjection $f_\gamma: \omega_\alpha \to \gamma$. Using the axiom of choice, we obtain a sequence of such functions $\lbrace f_\gamma \rbrace_{\gamma < \omega_{\alpha + 1}}$.

For each $\beta < \omega_\alpha$ and $\gamma < \omega_{\alpha + 1}$, define:

\[A_{\gamma\beta} = \{ \xi < \omega_{\alpha + 1} : f_\xi(\beta) = \gamma \}\]

Observe the following:

$\bigcup_{\beta < \omega_\alpha} A_{\gamma\beta} = \omega_{\alpha + 1} - (\gamma + 1)$
$\gamma_1 \neq \gamma_2 \implies A_{\gamma_1 \beta} \cap A_{\gamma_2 \beta} = \varnothing$

In particular, by $\kappa$-completeness, $\gamma + 1 \in \mathcal{J}$ implies $\bigcup_{\beta < \omega_\alpha} A_{\gamma \beta} \notin \mathcal{J}$. Hence, for some $\beta$, $A_{\gamma\beta} \notin \mathcal{J}$. Choosing such $\beta_\gamma$ for each $\gamma$, we obtain $\lbrace A_{\gamma\beta_\gamma} \rbrace_{\gamma < \omega_{\alpha + 1}}$. However, the possible values of $\beta_\gamma$ are limited to $\omega_\alpha$, so by the pigeonhole principle, there exists some $\beta < \omega_\alpha$ such that $|\lbrace A_{\gamma\beta_\gamma} : \beta_\gamma = \beta \rbrace| = \aleph_{\alpha + 1}$. This contradicts the countable chain condition. ■

Corollary. If the measure problem has a positive solution, then the continuum hypothesis has a negative solution.

Proof. If the measure problem holds and $2^{\aleph_0} = \aleph_1$, then by the theorem, $\aleph_0$ or $\aleph_1$ must be weakly inaccessible. However, $\aleph_0$ is countable, and $\aleph_1$ is a successor cardinal, leading to a contradiction. ■

교환법칙, 분배법칙, 지수법칙의 범주론적 증명

27 Dec 2025

수학

범주론

정리.

분배법칙: $(A + B) \times C = A \times C + B \times C$

지수법칙: $(A \times B)^C = A^C \times B^C$

이전 글에서 범주론적 합과 곱을 살펴 보았다. 합은 쌍극한의 일종이고, 곱은 극한의 일종이다. 이 관찰로부터, 분배법칙과 지수법칙을 다음 정리로 일반화할 수 있다.

정리. 좌 어드조인트는 쌍극한_colimit을, 우 어드조인트는 극한_limit을 보존한다.

이 정리의 정확한 의미는 다음과 같다. $F: \mathcal{A} \to \mathcal{B}, G: \mathcal{B} \to \mathcal{A}$에 대해 $F \dashv G$라고 하자. $G$가 극한을 보존_{preserves limits}한다는 것은, 임의의 작은 범주_{small category} $\mathbf{I}$와 함자 $D : \mathbf{I} \to \mathcal{B}$에 대해 다음이 성립한다는 것이다.

$\big(B \xrightarrow{q_I} D(I)\big)_{I \in \mathbf{I}}$가 $\mathcal{B}$에서 극한 고깔_{limit cone}일 때, $\big(G(B) \xrightarrow{G(q_I)} GD(I)\big)_{I \in \mathbf{I}}$가 $\mathcal{A}$에서 극한 고깔이다.

이로부터, $G$가 극한을 보존할 때 다음이 성립함을 알 수 있다.

\[G\left(\lim_{\leftarrow \mathbf{I}} D\right) \cong \lim_{\leftarrow \mathbf{I}} G \circ D\]

쌍극한의 경우에는 고깔을 쌍고깔로 바꿔 정의한다.

한편, 임의의 $X, Y, Z \in \mathbf{Set}$에 대해 $\hom(X \times Y, Z) \cong \hom(X, Z^Y)$가 성립한다 (이는 함수형 프로그래밍에서 자주 등장하는 커링_Currying 기법이다). 이로부터 다음의 어드조인트 관계가 따라 나온다.

\[(-) \times Y \dashv (-)^Y\]

따라서 함자 $(-) \times Y$는 쌍극한을 보존하고, $(-)^Y$는 극한을 보존한다. 이로부터 일반화된 분배법칙과 지수법칙을 얻을 수 있다.

정리. 집합 $A, B, Y$에 대해, $A, B$가 서로소라면 다음이 성립한다.

$(A + B) \times Y \cong (A \times Y) + (B \times Y)$

$(A \times B)^Y \cong A^Y \times B^Y$

여기서 $+$는 서로소 집합의 합집합으로, $\sqcup$과 의미가 같으나 분배법칙과의 유사성을 강조하기 위해 사용되었다. 좌변이 $G\left(\lim_{\leftarrow \mathbf{I}} D\right)$에, 우변이 $\lim_{\leftarrow \mathbf{I}} G \circ D$에 대응된다. 위 정리에 집합의 크기 연산을 취하면 자연수에서의 분배법칙과 지수법칙을 얻는다.

참고로 극한에 대한 다음 성질 또한 알려져 있다 ($\bullet, -$ 표기법은 이 글을 참고)

정리. $\mathbf{I}, \mathbf{J}$가 작은 범주이고, $\mathcal{S}$가 $\mathbf{I}, \mathbf{J}$ 모양의 극한들을 가지는 국소적으로 작은 범주라고 하자. $D: \mathbf{I} \times \mathbf{J} \to \mathcal{S}$에 대해, 다음이 성립한다.
\[\lim_{\leftarrow \mathbf{J}}\lim_{\leftarrow \mathbf{I}} D(\bullet, -) \cong \lim_{\leftarrow \mathbf{I} \times \mathbf{J}} D \cong \lim_{\leftarrow \mathbf{I}}\lim_{\leftarrow \mathbf{J}} D(-, \bullet)\]

즉, 극한끼리는 교환법칙을 만족한다. 이것은 덧셈의 교환법칙과 곱셈의 교환법칙을 일반화한 것으로 생각할 수 있다.

Categorical Proofs of the Commutative Law, Distributive Law, and Exponential Law

27 Dec 2025

Mathematics

Category Theory

This post was originally written in Korean, and has been machine translated into English. It may contain minor errors or unnatural expressions. Proofreading will be done in the near future.

Theorem.

Distributive law: $(A + B) \times C = A \times C + B \times C$

Exponential law: $(A \times B)^C = A^C \times B^C$

In a previous post, we explored categorical sums and products. Sums are a type of colimit, while products are a type of limit. From this observation, the distributive and exponential laws can be generalised into the following theorem.

Theorem. Left adjoints preserve colimits, and right adjoints preserve limits.

The precise meaning of this theorem is as follows. Let $F: \mathcal{A} \to \mathcal{B}$ and $G: \mathcal{B} \to \mathcal{A}$, and suppose $F \dashv G$. The statement that $G$ preserves limits means that for any small category $\mathbf{I}$ and functor $D : \mathbf{I} \to \mathcal{B}$, the following holds:

If $\big(B \xrightarrow{q_I} D(I)\big)_{I \in \mathbf{I}}$ is a limit cone in $\mathcal{B}$, then $\big(G(B) \xrightarrow{G(q_I)} G(D(I))\big)_{I \in \mathbf{I}}$ is a limit cone in $\mathcal{A}$.

From this, we can deduce the following when $G$ preserves limits:

\[G\left(\lim_{\leftarrow \mathbf{I}} D\right) \cong \lim_{\leftarrow \mathbf{I}} G \circ D\]

For colimits, the definition is analogous, replacing cones with cocones.

On the other hand, for any $X, Y, Z \in \mathbf{Set}$, the following holds: $\hom(X \times Y, Z) \cong \hom(X, Z^Y)$ (this is a common technique in functional programming known as currying). From this, the following adjoint relationship arises:

\[(-) \times Y \dashv (-)^Y\]

Thus, the functor $(-) \times Y$ preserves colimits, and the functor $(-)^Y$ preserves limits. From this, we can derive the generalised distributive and exponential laws.

Theorem. For sets $A, B, Y$, if $A$ and $B$ are disjoint, the following holds:

$(A + B) \times Y \cong (A \times Y) + (B \times Y)$

$(A \times B)^Y \cong A^Y \times B^Y$

Here, $+$ denotes the disjoint union of sets, which is equivalent to $\sqcup$, but the notation $+$ is used to emphasise its similarity to the distributive law. The left-hand side corresponds to $G\left(\lim_{\leftarrow \mathbf{I}} D\right)$, while the right-hand side corresponds to $\lim_{\leftarrow \mathbf{I}} G \circ D$. By applying cardinality operations to the sets in the above theorem, we recover the distributive and exponential laws for natural numbers.

Additionally, the following property of limits is also known (the $\bullet, -$ notation is explained in this post):

Theorem. Let $\mathbf{I}$ and $\mathbf{J}$ be small categories, and let $\mathcal{S}$ be a locally small category that admits limits of shape $\mathbf{I}$ and $\mathbf{J}$. For $D: \mathbf{I} \times \mathbf{J} \to \mathcal{S}$, the following holds:
\[\lim_{\leftarrow \mathbf{J}}\lim_{\leftarrow \mathbf{I}} D(\bullet, -) \cong \lim_{\leftarrow \mathbf{I} \times \mathbf{J}} D \cong \lim_{\leftarrow \mathbf{I}}\lim_{\leftarrow \mathbf{J}} D(-, \bullet)\]

In other words, limits satisfy the commutative law. This can be thought of as a generalisation of the commutative laws for addition and multiplication.

이전 글 Previous 다음 글 Next

디멘의 블로그 Dimen's Blog 이데아를 여행하는 히치하이커 Alice in Logicland

1. 가산 추이적 모델

2. 표기에 관해

3. 상대화

4. 절대성

5. 강제 부분순서집합

6. 필터

7. 조밀성과 제너릭 필터

(여담) 8. 마틴 공리

9. $\mathbb{P}$-이름

10. 강제 관계

11. 강제법의 기본 정리

Reference

1. Countable Transitive Models

2. On Notation

3. Relativisation

4. Absoluteness

5. Forcing Posets

6. Filters

7. Denseness and Generic Filters

8. Martin’s Axiom (A Digression)

9. $\mathbb{P}$-names

10. Forcing Relation

11. Fundamental theorems of forcing

Reference

$\kappa$는 비가산이다.

$\kappa$는 정칙이다.

$\kappa$는 극한 기수이다.

$\kappa$ is uncountable.

$\kappa$ is regular.

$\kappa$ is a limit cardinal.

디멘의 블로그 Dimen's Blog
이데아를 여행하는 히치하이커

Alice in Logicland