Writing New Systems
This is a step-by-step guide for developing new classes in the Systems submodule. We advise you read the Design Philosophy first.
When to Add a New System
Each user-facing class in the Systems submodule is designed to handle a particular file format; therefore, if you want a new file format to be able to define a system, you will need a new System class.
The name of this class doesn’t have to reference the file format name.
Instead your new class should reference how the system is defined.
For example, while the Integral class
loads FCIDUMP files the name “Integral” is a reference to how the
system is defined by a set of integrals (rather than, say,
directly specifying a Hamiltonian matrix as with
MatrixHamiltonian).
For this tutorial, we’ll call our class Demo since it’s job
is to demonstrate how to write a new System class.
Adding a New File
Each unique system should be defined in its own Python file located in the
src/pydmqmc/systems/ directory.
The filename should be obviously connected to your new class. For example,
the MatrixHamiltonian class is in hamiltonian.py.
The file name should be written in lowercase letters and use underscores to separate words.
For this example, we’ll create src/pydmqmc/systems/demo.py and open it for editing.
Coding Your Class
We’ll walk through what’s needed to create your class in the sections below. This includes any import statements, defining the class, writing its initialization method, and what other methods (public or private) you may wish to add.
Import Statements
At minimum, you must import the base pydmqmc.systems.System
class for inheritance:
from .system import System
You may wish to import the utility module using
from .. import utils
All utility functions will be available in your code as utils.function_name.
Note
These imports (from .system and from ..) mean
we are importing from files rather than the pydmqmc library. The former imports
from the system.py file in the same directory as the file we are editing
while the latter imports the utils folder from the directory above our current one.
Importing directly from files and directories like this instead of
importing from the pydmqmc package
(e.g., from pydmqmc.systems import System) helps prevent circular imports.
For good type hinting, you may also wish to import the following NumPy types depending on your needs:
from numpy.typing import NDArray as Array
from numpy.typing import ArrayLike
You may of course import any other packages or routines that you may need to
help you write your code, including NumPy and SciPy routines. If you are only
using particular functions from these libraries, please import them individually
using from.
Defining Your Class
After you’ve set up your imports, its time to define your new class.
We’ll go ahead and define the Demo class.
This definition will include our inheritance of the base
System class, an example docstring following
the numpydoc standard, and the standard __init__ initialization function
Note that the input_file and is_complex parameters are required for all Systems:
class Demo(System): # we inherit from the base System class
r"""
Write a one-sentence summary of the class here.
You can elaborate more down below. Perhaps this elaboration involves
some math or other symbols, which is why we added the 'r' at the beginning
of this docstring. The rest of this docstring should be formatted
according to the numpydoc expectations.
At minimum you should include a "Parameters" section which documents all
the parameters needed for the __init__ function.
You will likely not need an "Attributes" section because any publicly
accessible attributes are actually methods tagged with the `@property`
decorator.
Parameters
----------
input_file : str
Filename to load.
n_orbitals : int
The number of orbitals in our system.
is_complex : bool, default False
Whether or not the Hamiltonian is complex.
eigenvalues : array_like, optional
Set of eigenvalues for the system.
"""
def __init__(
self,
input_file: str, # type hints
n_orbitals: int,
is_complex: bool = False,
eigenvalues: ArrayLike | None = None
) -> None:
# No docstring is needed
# because we wrote the docstring under the class definition
Important
Notice that the parameters for __init__ are documented in the Parameters
section of the class docstring. Type hints and default values match between the code
and the docstring.
Note
Class names should be written using “Pascal case,” often stylized as
PascalCase. As the stylization suggests, the first letter of every
word should be capitalized and no spaces or other separators should be used.
Structure of the Initialization Method
Let’s dive into the structure of the __init__ method. This is the function that gets
run any time we instantiate a Demo object; therefore, this method should do
everything we need to setup our system. The base System class
will take care of some of this for us. The basic structure is as follows:
Call
super().__init__with a subset of the parameters passed to the surrounding__init__function.Set the Expected Private Attributes with system-specific initialization, such as reading its particular filetype.
Call
super()._set_derived_quants()to set some additional attributes usingself._orbsymandself._norb.
We can see this in a sample code block:
def __init__(self,
input_file: str, # type hints
n_orbitals: int,
is_complex: bool = False,
eigenvalues: ArrayLike | None = None) -> None:
# No docstring is needed
# because we wrote the docstring under the class definition
# First, call the base class's initialization function.
# This will set the input_file and is_complex attributes.
# Other attributes will be set to None.
super().__init__(input_file=input_file,
is_complex=is_complex)
# Now, we need to do initialization that's specific to our class.
# Some can be initialized from parameters...
self._norb = n_orbitals
if eigenvalues is not None: # requires us to import numpy as np
if isinstance(eigenvalues, np.ndarray):
self._eig = eigenvalues
else:
self._eig = np.array(eigenvalues)
# ...others will need to be read in from file
self._read_file(input_file)
# The System class has a private function for defining some derived attributes
# assuming self._orbsym and self._norb have been defined already.
# If these attributes haven't been defined, then this function will do nothing.
# This is the VERY LAST thing that should be done in __init__
super()._set_derived_quants()
Note
The super function is a special Python function that resolves complex
chains of inheritance. It’s best practice to use super to access
methods from parent classes rather than invoking them directly (e.g.
System.__init__).
Expected Private Attributes
While the System API documentation outlines how users may
access attributes stored within a class, each of these maps to a private attribute
that should be used internally by the class. This is to adhere to the design principle
that System Objects are Static. The mapping between user-facing name in the API documentation
and internal private attributes is outlined alphabetically below.
These attributes should be defined through a combination of user arguments
(passed to __init__) and the input_file read during initialization.
You may allow attributes to go undefined; however, this runs counter to the
Maximize Interoperability design guideline.
Public Attribute |
Private Attribute |
|---|---|
bitarrays |
_bitarrays |
eigenvalues |
_eig |
excitation_matrix |
_nex_mat |
hamiltonian |
_H |
input_file |
_input_file |
is_complex |
_is_complex |
max_symmetery |
_maxsym |
n_alpha |
_na |
n_beta |
_nb |
n_determinants |
_ndets |
n_electrons |
_nel |
n_orbitals |
_norb |
orbital_pg_symmetry |
_orbsym |
orbitals |
_orbs |
pg_mask |
_pg_mask |
ref_energy |
_ref_eng |
spin_polarizations |
_ms |
Writing Other Functions
You are free to add any additional functions that its reasonable for your class
to need or support. For example, in the above section we indicated that __init__
should call a private _read_file function. Such a function would be defined as follows:
def _read_file(self) -> None:
"""Private functions aren't required to have docstrings, but they're helpful"""
with open(self._input_file, "r") as f:
# process the lines of the file
...
You may also wish to define generate functions such as generate_hamiltonian.
Such functions may be used by Method classes to accomplish tasks that are computationally
expensive to perform at initialization. The DMQMC methods, for instance, count on the
presence of a generate_hamiltonian method if the hamiltonian attribute is None.
Note
Functions should be named following the “lower with under” convention
(stylized as lower_with_under). As the stylization suggests, words should
be lowercase and underscores should be used between words. Function names that
start with an underscore will be private and only available within class
definitions (either by the class itself or to any child classes).
Adding Your Class to the Package
Now that your class has been created, its time to add it to pydmqmc!
Open the file src/pydmqmc/systems/__init__.py. This is the file
that defines what get included when you run import pydmqmc.systems.
In this file, you’ll need to add a line importing your new class from the file it lives in. For example:
from .demo import Demo
If you installed pydmqmc in editable mode,
the new Demo class will be available instantly. If not,
you’ll need to reinstall pydmqmc.
Either way, you can access your new class is now accessible via the Systems submodule:
from pydmqmc.systems import Demo
Linting and Formatting
Python developers will use software called “linters” and “formatters” to help keep their code clean. This improves readability for all developers on a project—including your future self. Linting focuses on the quality of your code itself while formatting fixes the visual style of the source code.
The pydmqmc library uses Ruff for both linting and formatting. From the top level directory of the pydmqmc package, run:
ruff format
ruff check
The first command will automatically clean up any formatting inconsistencies.
The second will return a report with fixes you should make yourself, such as
removing unused import statements. Some of these may be automatically fixable
with ruff check --fix. Others may need to be ignored; you can add known, ignorable
issues to the tool.ruff.lint.per-file-ignores of the pyproject.toml.
Writing Tests
It’s a good idea to write unit and regression tests for your new class both to ensure everything is working as you expect and to ensure future changes don’t break your code.
Tests for the system classes live in the tests/systems/ directory.
You’ll want to create a new file with the prefix test_ followed by the
file name your new class lives in; for example, tests/systems/test_demo.py.
This guide doesn’t have the space to explain how to write good tests, but you should make sure any public methods (including class initialization) work with a wide range of expected user inputs. You can test for cases where the code should fail as well as instances where it should give a correct answer or provide a certain result.
Documenting Your Class
Software is worth more if its documented! There are a few places you should document your new system class to make it more accessible to future users. The pydmqmc documentation is written using reStructuredText. You can always view your newly created documentation by Building the Documentation.
There are two places your system should be documented. One is the Using the Available Systems page which describes the context in which each system should be used. This page also notes any caveats about the systems. The second is the API Reference, which gives details about how classes should be invoked and what members they contain.
Extending the System Reference
The Using the Available Systems page details all systems that are available within
pydmqmc. This page’s source code is found in docs/systems.rst.
You should add your new system here following the existing style,
as shown in the example below:
.. define a tag for referencing this section elsewhere
.. _demo-systems:
Demo Systems
------------
This class should be used for defining demonstration systems.
It can optionally accept a list of eigenvalues, which may allow it
to work with a broader range of Methods. All such caveats should
be explained here.
Extending the API Reference
Within the API Reference, there is a page dedicated to the
Systems Submodule that summarizes both the base System
class and its child classes. The child classes are described via links
to their own individual pages.
First, create a new individual page for your class in docs/source/api/systems/.
For the example we’ve been using here, the page will be called docs/source/api/systems/demo.rst.
The contents of this page are pretty minimal. Replace all references to “demo”
in the following example with your own class name:
.. _api-demo:
Demo
====
.. autoclass:: pydmqmc.systems.Demo
The autoclass directive means that API documentation will be automatically generated
from the class’s source code (including it’s docstring!) when the documentation is built.
With the individual page created, open docs/source/api/systems/index.rst.
Near the top of this file is the toctree directive, which is short for
“table of contents tree.” Individual filenames can be added to this block
in order to create a table of contents on the page. Without modifying anything
else about the block, go ahead an add the name of the file you created above:
.. toctree::
:maxdepth: 1
integral
matrix-hamiltonian
demo