LLM4Rocq · JulesViennotFranca · Dec 17, 2024 · Dec 17, 2024 · Dec 17, 2024 · Dec 17, 2024
diff --git a/.gitignore b/.gitignore
@@ -1,8 +1,12 @@
 .vscode
 .envrc
 .python-version
+miniF2F
+examples/small_miniF2F/coq
 __pycache__
 poetry.lock
 outputs
 logs
-*.jsonl
+*.jsonl
+.DS_Store
+_CoqProject
diff --git a/README.md b/README.md
@@ -1,10 +1,10 @@
 # NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving
 
 NLIR leverage LLMs natural language reasoning ability for theorem proving with the Rocq interactive theorem prover (ITP).
-We propose two interactive proof protocols both leveraging natural language reasoning: 
+We propose two interactive proof protocols both leveraging natural language reasoning:
 
- - Tactic-by-tactic proof construction mimics the typical behavior of a standard Coq user: given the current goals, the agent generates one or several tactics that updates the goals and repeats this process until the proof is complete. 
- - Hierarchical proof templating tries to generate full proofs directly. Failed tactics are then replaced with holes to obtain a proof template. The agent repeats the process of filling each hole until the proof is complete. 
+- Tactic-by-tactic proof construction mimics the typical behavior of a standard Coq user: given the current goals, the agent generates one or several tactics that updates the goals and repeats this process until the proof is complete.
+- Hierarchical proof templating tries to generate full proofs directly. Failed tactics are then replaced with holes to obtain a proof template. The agent repeats the process of filling each hole until the proof is complete.
 
 Our approach’s originality is that although both protocols’ inputs (goals) and outputs (tactics) are Coq code, the agent internally uses natural language as an intermediate representation to analyze the input and guide the code generation.
 We couple both protocols with standard search algorithms leveraging feedback from the ITP and using natural language to rerank proof candidates.
@@ -39,6 +39,7 @@ export OPENAI_PROJECT="your project id"
 The default configuration can be found in `conf/config.yaml`.
 You can override every field (see below).
 E.g., to try the tactics agent without beam search on theorem `foo` defined in `examples/foo.v`:
+
 ```
 $ python nlir-cli.py workspace=examples file=foo.v theorem=foo search.kind=tactics search.mode=naive
 ```
@@ -50,7 +51,7 @@ You should see each iteration of the proof in stdout.
 We use [hydra](https://hydra.cc/docs/intro/) to manage the configurations.
 
 ```bash
-$ python nlir-cli.py --help           
+$ python nlir-cli.py --help
 nlri-cli is powered by Hydra.
 
 There are two possible modes:
@@ -123,6 +124,50 @@ $ python nlir-cli.py benchmark=example
 
 The conversation logs and the configuration will be stored in `./outputs/date/time/`
 
+### Translating
+
+This respository also provides a translation mechanism. Given a natural language description of a theorem along with its Lean and Isabelle codes, the agent will try to translate the theorem in Rocq.
+
+This add-on main purpose is to translate the set of theorems *miniF2F* found [here](https://github.com/facebookresearch/miniF2F/tree/main). So if used on another set of theorems, the set should have the same structure as the [small_miniF2F](./examples/small_miniF2F/) folder.
+
+There are two modes for this program, one can translate a single theorem by specifying its name or translate the whole set if no theorem is specified.
+
+Additionally, you can also supervise the model by giving him remarks or hints after each error.
+
+The default configuration can be found in `conf/translate_config.yaml`. Every field can be overriden using the [hydra](https://hydra.cc/docs/intro/) syntax:
+
+```bash
+$ python translate-cli.py --help
+translate-cli is powered by Hydra.
+
+There are two possible modes:
+- Use the option `theorem=my_theorem` to translate one theorem.
+- With no theorem specified, all theorems are translated.
+
+To enable supervision of the model, use the option `supervise=true`.
+
+Alternatively you can use your own config file with the option `--config-name my_translate_config.yaml`.
+Config files should be in the `conf` directory.
+
+== Config ==
+Override anything in the config (foo.bar=value)
+
+workspace: examples/small_miniF2F # Path to the set of theorems
+theorem: null                     # Name of the single theorem to be translated
+supervise: false                  # Supervised translation or not
+log_dir: "logs/translate"         # Directory to store the log files
+
+agent:
+  model_id: qwencode7b       # LLM id
+  temperature: 1.0           # Temperature used
+  local: true                # Is the model local or not
+  provider: ollama           # Provider of the model
+
+
+Powered by Hydra (https://hydra.cc)
+Use --hydra-help to view Hydra specific help
+```
+
 ### Using custom config files
 
 You can also try one of the pre-defined benchmark configuration files (or write your own), and override some parameters, e.g.:

diff --git a/conf/myconfig.yaml b/conf/myconfig.yaml
diff --git a/conf/my_config.yaml → conf/translate_config.yaml b/conf/my_config.yaml → conf/translate_config.yaml
@@ -1,40 +1,33 @@
-workspace: "../my_nlir/fminiF2F/coq/"
-file: null
+workspace: "examples/small_miniF2F"
 theorem: null
-replay: false
-benchmark: null
-start_theorem: 0
-end_theorem: 300
-num_theorems: 300
-log_dir: "logs"
+supervise: false
+log_dir: "logs/translate"
 weave: false
 
 petanque:
   address: 127.0.0.1
   port: 8765
   timeout: 10
   run_opts: null
-  context: true
 
 search:
-  kind: template
   mode: naive
   max_steps: 10
   beam_size: 3
   n_responses: 4
 
 hydra:
   help:
-    app_name: nlri-cli
+    app_name: translate-cli
 
     template: |
       ${hydra.help.header}
       There are two possible modes:
-      - Use options `file=my_file.v` and `theorem=my_thm` to prove one theorem.
-      - Use option `benchmark=my_bench.yaml` to test a full benchmark.
+      - Use options `theorem=my_thm` to translate one theorem.
+      - With no theorem specified, all theorems are translated.
 
       Alternatively you can use your own config file with the option `--config-name myconf.yaml`.
-      Config files should be in the `conf` directory.
+      Config files should be in the `conf/translate` directory.
 
       == Config ==
       Override anything in the config (foo.bar=value)
@@ -43,9 +36,9 @@ hydra:
 
       ${hydra.help.footer}
 
+# @package _global_
 defaults:
   - _self_
-  - agent: gpt-mini
-  - benchmark: null
+  - agent: qwencode7b
   - override hydra/job_logging: none
   - override hydra/hydra_logging: none
diff --git a/examples/small_miniF2F/informal/test/mathd_algebra_478.json b/examples/small_miniF2F/informal/test/mathd_algebra_478.json
@@ -0,0 +1 @@
+{"problem_name": "mathd_algebra_478", "informal_statement": "The volume of a cone is given by the formula $V = \\frac{1}{3}Bh$, where $B$ is the area of the base and $h$ is the height. The area of the base of a cone is 30 square units, and its height is 6.5 units. What is the number of cubic units in its volume? Show that it is 65.", "informal_proof": "We are given that $B = 30$ and $h = 6.5$ and asked to find $\\frac{1}{3}Bh$.  We find that \\[\\frac{1}{3}Bh = \\frac{1}{3}(30)(6.5) = (10)(6.5) = 65.\\]"}
diff --git a/examples/small_miniF2F/informal/valid/amc12a_2015_p10.json b/examples/small_miniF2F/informal/valid/amc12a_2015_p10.json
@@ -0,0 +1 @@
+{"problem_name": "amc12a_2015_p10", "informal_statement": "Integers $x$ and $y$ with $x>y>0$ satisfy $x+y+xy=80$. What is $x$?\n\n$ \\textbf{(A)}\\ 8 \\qquad\\textbf{(B)}\\ 10 \\qquad\\textbf{(C)}\\ 15 \\qquad\\textbf{(D)}\\ 18 \\qquad\\textbf{(E)}\\ 26$ Show that it is \\textbf{(E)}\\ 26.", "informal_proof": "Use [[SFFT]] to get $(x+1)(y+1)=81$. The terms $(x+1)$ and $(y+1)$ must be factors of $81$, which include $1, 3, 9, 27, 81$. Because $x > y$, $x+1$ is equal to $27$ or $81$. But if $x+1=81$, then $y=0$ and so $x=\\textbf{(E)}\\ 26$."}
diff --git a/examples/small_miniF2F/isabelle/test/mathd_algebra_478.thy b/examples/small_miniF2F/isabelle/test/mathd_algebra_478.thy
@@ -0,0 +1,17 @@
+(*
+  Authors: Wenda Li
+*)
+
+theory mathd_algebra_478 imports Complex_Main
+begin
+
+theorem mathd_algebra_478:
+  fixes b h v ::real
+  assumes "0 < b \<and> 0 < h \<and> 0 < v"
+      and "v = 1 / 3 * (b * h)"
+      and "b = 30"
+      and "h = 13 / 2"
+    shows "v = 65"
+  using assms by auto
+
+end
diff --git a/examples/small_miniF2F/isabelle/valid/amc12a_2015_p10.thy b/examples/small_miniF2F/isabelle/valid/amc12a_2015_p10.thy
@@ -0,0 +1,18 @@
+(*
+  Authors: Albert Qiaochu Jiang
+*)
+
+
+theory amc12a_2015_p10 imports
+Complex_Main
+begin
+
+theorem amc12a_2015_p10:
+  fixes x y:: nat
+  assumes h0: "0<y"
+    and h1: "y<x"
+    and h2: "x+y + (x*y) = 80"
+  shows "x=26"
+  sorry
+
+end
diff --git a/examples/small_miniF2F/lean/src/minif2f_import.lean b/examples/small_miniF2F/lean/src/minif2f_import.lean
@@ -0,0 +1,87 @@
+import algebra.algebra.basic
+import algebra.order.floor
+import algebra.associated
+import algebra.big_operators.basic
+import algebra.big_operators.enat
+import algebra.big_operators.order
+import algebra.big_operators.pi
+import algebra.geom_sum
+import algebra.group.pi
+import algebra.group.commute
+import algebra.group_power.basic
+import algebra.group_power.identities
+import algebra.order.floor
+import algebra.quadratic_discriminant
+import algebra.ring.basic
+import analysis.asymptotics.asymptotic_equivalent
+import analysis.mean_inequalities
+import analysis.normed_space.basic
+import analysis.inner_product_space.basic
+import analysis.inner_product_space.euclidean_dist
+import analysis.normed_space.pi_Lp
+import analysis.special_functions.exp
+import analysis.special_functions.exp_deriv
+import analysis.special_functions.log
+import analysis.special_functions.logb
+import analysis.special_functions.log_deriv
+import analysis.special_functions.pow
+import analysis.special_functions.sqrt
+import analysis.special_functions.trigonometric.basic
+import analysis.special_functions.trigonometric.complex
+import combinatorics.simple_graph.basic
+import data.complex.basic
+import data.complex.exponential
+import data.finset.basic
+import data.fintype.card
+import data.int.basic
+import data.int.gcd
+import data.int.modeq
+import data.int.parity
+import data.list.intervals
+import data.list.palindrome
+import data.multiset.basic
+import data.nat.basic
+import data.nat.choose.basic
+import data.nat.digits
+import data.nat.factorial.basic
+import data.nat.fib
+import data.nat.modeq
+import data.nat.multiplicity
+import data.nat.parity
+import data.nat.prime
+import data.pnat.basic
+import data.pnat.prime
+import data.polynomial
+import data.polynomial.basic
+import data.polynomial.eval
+import data.rat.basic
+import data.real.basic
+import data.real.ennreal
+import data.real.irrational
+import data.real.nnreal
+import data.real.sqrt
+import data.real.golden_ratio
+import data.set.finite
+import data.sym.sym2
+import data.zmod.basic
+import dynamics.fixed_points.basic
+import field_theory.finite.basic
+import geometry.euclidean.basic
+import geometry.euclidean.circumcenter
+import geometry.euclidean.monge_point
+import geometry.euclidean.sphere
+import init.data.nat.gcd
+import linear_algebra.affine_space.affine_map
+import linear_algebra.affine_space.independent
+import linear_algebra.affine_space.ordered
+import linear_algebra.finite_dimensional
+import logic.equiv.basic
+import measure_theory.integral.interval_integral
+import number_theory.arithmetic_function
+import number_theory.legendre_symbol.quadratic_reciprocity
+import number_theory.primes_congruent_one
+import order.bounds
+import order.filter.basic
+import order.well_founded
+import topology.basic
+import topology.instances.nnreal
diff --git a/examples/small_miniF2F/lean/src/test.lean b/examples/small_miniF2F/lean/src/test.lean
@@ -0,0 +1,24 @@
+/-
+Copyright (c) 2021 OpenAI. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Kunhao Zheng, Kudzo Ahegbebu, Stanislas Polu, David Renshaw, OpenAI GPT-f
+-/
+import minif2f_import
+
+open_locale big_operators
+open_locale nat
+open_locale real
+open_locale rat
+
+theorem mathd_algebra_478
+  (b h v : ℝ)
+  (h₀ : 0 < b ∧ 0 < h ∧ 0 < v)
+  (h₁ : v = 1 / 3 * (b * h))
+  (h₂ : b = 30)
+  (h₃ : h = 13 / 2) :
+  v = 65 :=
+begin
+  rw [h₂, h₃] at h₁,
+  rw h₁,
+  norm_num,
+end
diff --git a/examples/small_miniF2F/lean/src/valid.lean b/examples/small_miniF2F/lean/src/valid.lean
@@ -0,0 +1,21 @@
+/-
+Copyright (c) 2021 OpenAI. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Kunhao Zheng, Stanislas Polu, David Renshaw, OpenAI GPT-f
+-/
+import minif2f_import
+
+open_locale big_operators
+open_locale real
+open_locale nat
+open_locale topological_space
+
+theorem amc12a_2015_p10
+  (x y : ℤ)
+  (h₀ : 0 < y)
+  (h₁ : y < x)
+  (h₂ : x + y + (x * y) = 80) :
+  x = 26 :=
+begin
+  sorry
+end
diff --git a/nlir-cli.py b/nlir-cli.py
@@ -157,6 +157,7 @@ def main(cfg: DictConfig):
             results["success"].append(status.success)
             results["steps"].append(status.steps)
             with open(res_path, "w") as rf:
+                print(f"Result writen in {res_path}")
                 json.dump(results, rf, indent=2)
 
         print(f"\n\n--- Summary ---")
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1 @@
		{"problem_name": "mathd_algebra_478", "informal_statement": "The volume of a cone is given by the formula $V = \\frac{1}{3}Bh$, where $B$ is the area of the base and $h$ is the height. The area of the base of a cone is 30 square units, and its height is 6.5 units. What is the number of cubic units in its volume? Show that it is 65.", "informal_proof": "We are given that $B = 30$ and $h = 6.5$ and asked to find $\\frac{1}{3}Bh$. We find that \\[\\frac{1}{3}Bh = \\frac{1}{3}(30)(6.5) = (10)(6.5) = 65.\\]"}
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1 @@
		{"problem_name": "amc12a_2015_p10", "informal_statement": "Integers $x$ and $y$ with $x>y>0$ satisfy $x+y+xy=80$. What is $x$?\n\n$ \\textbf{(A)}\\ 8 \\qquad\\textbf{(B)}\\ 10 \\qquad\\textbf{(C)}\\ 15 \\qquad\\textbf{(D)}\\ 18 \\qquad\\textbf{(E)}\\ 26$ Show that it is \\textbf{(E)}\\ 26.", "informal_proof": "Use [[SFFT]] to get $(x+1)(y+1)=81$. The terms $(x+1)$ and $(y+1)$ must be factors of $81$, which include $1, 3, 9, 27, 81$. Because $x > y$, $x+1$ is equal to $27$ or $81$. But if $x+1=81$, then $y=0$ and so $x=\\textbf{(E)}\\ 26$."}