yarsync-0.3.1/.gitattributes:
*.py diff=python

yarsync-0.3.1/.gitignore:
yarsync.egg-info/
dist/
.coverage
.pyc
.tox
tests/.hypothesis/
docs/build
build
**/__pycache__/

yarsync-0.3.1/.readthedocs.yaml:
# Read the Docs configuration file for Sphinx projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
    # You can also specify other tool versions:
    # nodejs: "20"
    # rust: "1.70"
    # golang: "1.20"

# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/source/conf.py
  # You can configure Sphinx to use a different builder,
  # for instance use the dirhtml builder for simpler URLs
  # builder: "dirhtml"
  # Fail on all warnings to avoid broken references
  # fail_on_warning: true

# Optionally build your docs in additional formats such as PDF and ePub
formats:
  - pdf
  - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt

yarsync-0.3.1/LICENSE:
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. 
For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. 
The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. 
The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. 
The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. 
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. 
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. 
d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. 
A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. 
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. 
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. 
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. 
The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. 
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. 
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.

If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program.

Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.

If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

    Copyright (C)

    This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program. If not, see .

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:

    Copyright (C)
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see .

The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read .

yarsync-0.3.1/NEWS.rst

===========================
YARsync 0.3
===========================

YARsync minor release 0.3 was done on March 28, 2025. This release does not add much functionality, but fixes several bugs and improves documentation and technical features of the project.

What's new
----------

* The *clone* command gets a ``--force`` option.

Bug fixes
---------

* Fixes some bugs in *clone_from* and *show* connected with the synchronisation directory.

Documentation
-------------

* Adds a *Developing and Contributing* section.
* Updates tips on using a virtual environment.

Technical changes
-----------------

* Adds *pyproject.toml* instead of *setup.py*, as is required by modern Python packaging.
* Updates tests to run for Python 3.13.
* Linted out many unused imports in test files.
* This is the last release tested with Python 3.6.
Community
---------

This release is mostly due to `statzitz <https://github.com/statzitz>`_, who updated tests to work with the recent Python version (fixing a serious bug which prevented many users from using the tool; thanks to the complaints from AUR users as well) and added many documentation improvements. Also thanks to `Lin, Yong Xiang <https://github.com/r888800009>`_ for several bug reports and useful discussions.

Today the project has 39 stars on GitHub and 3 contributors, which looks rather popular to me and inspires its better support and development.

Publicity
---------

*yarsync* was presented in August 2024 at a conference of Python developers in high energy physics, PyHEP.dev 2024. A `video `_ was recorded and published on YouTube (link added to the README).

===========================
YARsync 0.2.1
===========================

YARsync patch release 0.2.1 was done on 28 March 2023.

* Adds a ``--version`` command.
* Improves diagnostic messages.
* Improves documentation.
* Fixes tests.

===============
YARsync 0.2
===============

YARsync v0.2 was released on the 8th of March, 2023. Its main features were synchronization, commit limit and cloning.

Synchronization information is now stored in the directory *.ys/sync/*. It contains information on the most recent synchronized commits for each known repository. This information is transferred between replicas during ``pull``, ``push`` and ``clone``. This allows ``yarsync`` repositories to better support the 3-2-1 backup rule. To convert an old synchronization file to the new directory format, from the working directory one can use

    cat .ys/sync.txt && mkdir .ys/sync && touch .ys/sync/$(cat .ys/sync.txt|sed 's/,/_/g').txt && rm .ys/sync.txt

To properly support synchronization information, now each repository must have a unique name. The name is no longer automatically deduced from the host name, but contained in *.ys/repo_.txt*.
In particular, nameless repositories on external drives cannot be mixed with nameless local repositories. One can set the repository name with ``init`` (this command does not affect existing files and is always safe).

A user of ``yarsync`` was concerned that ``rsync`` does not work well with millions of files, and proposed automatically removing old commits. To achieve that, a *commit limit* was introduced. It can be set using the *limit* option of ``commit``. When more commits than that limit appear, older commits are removed during ``commit``. ``pull`` and ``push`` don't check whether the destination has commits missing on the source if the local repository has a commit limit (this makes a repository with a commit limit more like a central repository).

Bug fixes
---------

* *--no-inc-recursion* is always active for ``pull`` and ``push``. Fixes a bug when ``pull`` *--new* retransferred files already present in commits.
* ``pull`` *--new* disables automatic checkout of commits after merge. This prevents deletion of uncommitted files in the working directory (they should be preserved when using *--new*).

Improvements
------------

* ``commit`` adds an option *--limit*. ``status`` shows the commit limit (if present). Commit limits are logged (during commit).
* ``init`` prompts for input when no repository name is given on the command line.
* ``status`` no longer outputs group and owner changes. This information is ignored by ``yarsync`` and considered noise. Set the proper user and group for all files when needed.
* Improves output in case of errors.
* ``pull`` changes:

  * *--new* allows the local repository to have uncommitted changes.
  * *--new* allows local or remote commits to be missing.

* ``pull`` and ``push`` changes:

  * Improves output for ``pull`` and ``push``. All files for commits that are transferred as a whole (that is, new ones) are being output on a single line (that commit name).
    This makes output more focused on the actual changes in the working directory and on existing commits (if they contained changes).

  * ``yarsync`` no longer updates user and group ids for ``pull`` and ``push`` (and indirectly ``clone``). This allows one to have different user and group ids on different machines and storage drives, ignoring this metadata. yarsync repositories are supposed to contain data belonging to one user.
  * If the local repository has a commit limit, the destination can have commits missing on the source.

Backward incompatible changes
-----------------------------

* The ``clone`` command and interface changed. ``clone`` allows copying to a remote. The new repository name must be provided explicitly. Cloning from inside a repository with an *rsync-filter* is allowed.
* Turns off the ``pull/push`` *--overwrite* (``rsync`` *--ignore-existing*) functionality. Waiting for https://github.com/WayneD/rsync/issues/357 to be fixed.
* Repositories are not checked for changes in the working directory for ``push`` or ``pull`` if the *--force* option is given.
* A name for each repository is required (to assist synchronization).
* The repository name is no longer stored in *repository.txt*, but in *repo_.txt*. This allows ``yarsync`` to know remote repository names from listing their configuration files.

Technical changes
-----------------

Documentation has been moved to Read the Docs.

* ``yarsync`` is tested for Python 3.11.
* The ``yarsync`` development classifier on PyPI becomes "5 - Production/Stable".
* Adds *.gitattributes* (to log revisions of functions).
* Tests improvements:

  * Adds *helpers.py* (for cloning test repositories).
  * Fixes hardlink fixtures.

* Implements the ``init`` *--merge* option. It is not tested and shall be added in the next release.
* *_print_command* accepts lists and properly escapes commands with spaces. String and list representations of commands are no longer needed.
* The *_commit* method accepts arguments explicitly.
* Adds *_Config* and *_Sync* helper classes.
* Documentation improvements:

  * Adds a how-to for synchronizing repositories after adding external data to both of them (see the details section).
  * Documentation uses Sphinx. Needs fixes for the pdf version.

Test coverage is 79% (253/1224 missing/total).

Publication
-----------

``yarsync`` v0.1 was packaged for Arch Linux, Debian and PyPI (and will be updated for v0.2). A talk on ``yarsync`` was given at the Winter seminar of the Physics Institute of the RWTH Aachen University in Montafon in February 2023. The program was announced on the ``rsync`` mailing list, published on Arch Wiki and Arch Forum, and in several Russian programming Telegram chats.

===========================
YARsync 0.1.1+deb
===========================

YARsync patch release 0.1.1+deb was done on 6 July 2022.

* Fixes the manual for whatis (lexgrog) parsing.
* Documentation improvements. Adds Installation, Documentation and Thanks sections to the README.

===========================
YARsync 0.1.1
===========================

YARsync patch release 0.1.1 was made on 30 June 2022. It adds a manual page, improves output and supports Python 3.6.

Improvements
------------

Tested and works for Python 3.6. Improves output handling in commit (allows verbosity settings). rsync always outputs error messages.

Bug fixes
---------

pull and push print output correctly.

=======================
YARsync release 0.1
=======================

The first tagged release YARsync v0.1 was made on 21st-23rd June 2022. The program works with Python 3.7, 3.8, 3.9, 3.10 and PyPy 3. Test coverage is 76% (209/889 missing to total).

yarsync-0.3.1/README.rst

=======
YARsync
=======

Yet Another Rsync is a file synchronization and backup tool. It can be used to synchronize data between different hosts or locally (for example, to a backup drive). It provides a familiar ``git`` command interface while working with files.
YARsync is a Free Software project covered by the GNU General Public License version 3.

------------
Installation
------------

``yarsync`` is packaged for Debian/Ubuntu. For Arch Linux, install the ``yarsync`` package `from AUR `_. Packages for other distributions are welcome.

For an installation `from PyPI `_, run

.. code-block:: console

    pip3 install yarsync

If you don't want to install it system-wide (e.g. for testing), see installation in a virtual environment in the `Developing <#developing-and-contributing>`_ section.

On macOS Ventura the built-in version of ``rsync`` is 2.6.9, while ``yarsync`` requires a newer one. Run

.. code-block:: console

    brew install rsync
    pip3 install yarsync

If ``rsync: --outbuf=L: unknown option`` occurs, make sure that the newly installed version of ``rsync`` is the one being used.

Since there is no general way to install a manual page for a Python package, one has to do it manually. For example, run as a superuser:

.. code-block:: console

    wget https://github.com/ynikitenko/yarsync/raw/master/docs/yarsync.1
    gzip yarsync.1
    mv yarsync.1.gz /usr/share/man/man1/
    mandb

Make sure that the manual path is correct for your system. The command ``mandb`` updates the index caches of manual pages.

One can also install the most recent program version `from GitHub `_. It incorporates the latest improvements, but at the same time is less stable (new features can be changed or removed).

.. code-block:: console

    git clone https://github.com/ynikitenko/yarsync.git
    pip3 install -e yarsync

This installs the ``yarsync`` executable to *~/.local/bin*, and does not require modifications of ``PYTHONPATH``. After that, one can pull the repository updates without reinstallation.

To **uninstall**, run

.. code-block:: console

    pip3 uninstall yarsync

and remove the cloned repository.

--------------------
Design and features
--------------------

``yarsync`` can be used to manage hierarchies of unchanging files, such as music, books, articles, photographs, etc.
Its final goal is to have the same state of files across different computers. It also allows one to store backup copies of the data and to easily copy, update or recover them.

``yarsync`` is

distributed
    There is no central host or repository for ``yarsync``. If different replicas diverge, the program assists the user to merge the repositories manually.

efficient
    The program is run only on user demand, and does not consume system resources constantly. Already transferred files will never be transmitted again. This allows the user to rename or move files or whole directories without any cost, encouraging constant improvement of the repository structure.

non-intrusive
    ``yarsync`` does nothing to user data. It has no complicated packing or unpacking. All user data and program configuration are stored as usual files in the file system. If one decides to stop using ``yarsync``, they can simply remove the configuration directory at any time.

simple
    ``yarsync`` does not implement complicated file transfer algorithms, but uses an existing, widely accepted and tested tool for that. User configuration is stored in simple text files, and repository snapshots are usual directories, which can be modified, copied or browsed from a file manager. All standard command line tools can be used in the repository, to assist its recovery or to allow any non-standard operations (for the users who understand what they do). Read the ``yarsync`` documentation to understand its (simple) design.

safe
    ``yarsync`` does its best to preserve user data. It always allows one to see what will be done before any actual modifications (*--dry-run*). This is its advantage over continuous synchronization tools, which may be dangerous if the local repository gets corrupt (e.g. encrypted by a trojan). Removed files are stored in older commits (until the user explicitly removes those).

..
    If a file gets corrupt, it will not be transferred by default, but when the user chooses to *pull --backup*, any diverged files will be visible (with their different versions preserved).

---------
Commands
---------

::

    checkout
    clone
    commit
    diff
    init
    log
    pull
    push
    remote
    show
    status

See ``yarsync --help`` for full command descriptions and options.

----------------------------
Requirements and limitations
----------------------------

``yarsync`` is a Python wrapper (available for ``Python>=3.6``) around ``rsync`` and requires a file system with **hard links**. Since these are very common tools, it can easily run on any UNIX-like system. Moreover, ``yarsync`` is not required to be installed on the remote host: it is sufficient for ``rsync`` to be installed there. In particular, ``rsync`` can be found:

* installed on most GNU/Linux distributions,
* installed on `Mac OS `_,
* can be installed on `Windows `_.

``yarsync`` runs successfully on Linux. Please report to us if you have problems (or success) running it on your system.

-------
Safety
-------

``yarsync`` has been used by the author for several years without problems and is tested. However, any data synchronization may lead to data loss, and it is recommended to have several data copies and to always do a *--dry-run* (*-n*) first before the actual transfer.

-------------
Documentation
-------------

For the complete documentation, read the installed or online `manual `_.

A 10-minute `video `_ with motivation, implementation ideas and an overview of the tool (and 6 more minutes for questions) was recorded during a conference in 2024.

For more in-depth topics or alternatives, see `details `_.

On the repository github, `release notes `_ can be found. On github pages there is the manual for `yarsync 0.1 `_. An article in Russian that deals more with ``yarsync`` internals was posted on `Habr `_.
---------------------------
Developing and contributing
---------------------------

You can use a virtual environment in order to avoid messing with your system while working on ``yarsync``:

.. code-block:: console

    python3 -m venv ~/.venv/yarsync_dev
    source ~/.venv/yarsync_dev/bin/activate
    # download a clean repository or use the existing one with your changes
    mkdir tmp && cd tmp
    git clone https://github.com/ynikitenko/yarsync

To build and then install ``yarsync``, run the following commands from the root of the repository:

.. code-block:: console

    cd yarsync
    pip install -r requirements.txt
    pip install .

Please make sure to run the tests and ensure you haven't broken anything before submitting a pull request.

.. code-block:: console

    pytest
    # Or to increase verbosity level
    # pytest -vvv

You can run tests on all supported Python versions by simply running ``tox`` in your virtual environment. Make sure to have installed some supported Python versions beforehand (at least two for ``tox`` to be useful).

.. code-block:: console

    tox

After all the tests you can remove the created directories or leave them for future tests.

------
Thanks
------

A good number of people have contributed to the improvement of this software. I'd like to thank (in most recent order):

*statzitz* for great help with updating tests for release *v0.3*, documentation and configuration,

Yong Xiang Lin for several bug reports and useful discussions,

Arch Linux users for their notifications and improvements of my PKGBUILD,

Nilson Silva for packaging ``yarsync`` for Debian,

Mikhail Zelenyy from MIPT NPM for the explanation of Python `entry points `_,

Jason Ryan and Matthew T Hoare for the inspiration to create a package for Arch,

Scimmia for a comprehensive review and suggestions for my PKGBUILD,

Open Data Russia chat for discussions about backup safety,

Habr users and editors,

and, finally, to the creators and developers of ``git`` and ``rsync``.
yarsync-0.3.1/docs/Makefile

# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

# to see a man page on the fly, use
# cat source/yarsync.1.md | sed 's/^##/#/g' | pandoc -s -t man | /usr/bin/man -l -
man:
	@cat source/yarsync.1.md | grep -v '# YARsync manual' | sed 's/^##/#/g' \
	| pandoc -s -t man > yarsync.1

.PHONY: help Makefile man

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

yarsync-0.3.1/docs/habr.md

Synchronizing data with yarsync
===============================

Hi, Habr! *yarsync* (Yet Another Rsync) is intended for synchronizing data between several devices, more precisely, between file systems in Unix-like environments. *yarsync* has an interface similar to *git* and is a Python wrapper around the *rsync* program. The program is available under the free GPL v3.0 license on [github](https://github.com/ynikitenko/yarsync) (I am the author).

*yarsync* works wherever Python and *rsync* are present. Data can be synchronized locally or between different computers (in that case *rsync* must also be installed on the remote machine). In addition, the file systems must support hard links.
Popular file systems that [support hard links](https://github.com/ynikitenko/yarsync#hard-links) are ext2-ext4, HFS+ and NTFS. FAT and exFAT (often used on flash drives) do not support hard links.

In simple terms, suppose you have computers at home and at your country house. You have a folder with books and articles on programming that you have been collecting for many years and that you use regularly (its copies on different machines). You want these copies to be identical: ideally, you should be able to work with the data on different computers (add new articles, delete unneeded ones, rename and move files and folders), and then easily transfer these changes to the other copies. This is what yarsync does: it tracks changes and allows you to synchronize the data efficiently through an available server or an external drive (hard disk).

Before talking about the design, it is worth stating the goals of *yarsync*, which are:

- user convenience. This includes not only a familiar interface, but also a reduced risk of errors.
- performance. *rsync* uses an efficient algorithm for data transfer (transmitting only the differences). Files that have already been transferred are never transmitted again (even if they were moved or renamed). The program is invoked only when necessary and does not constantly occupy the CPU and memory.
- reliability. The first release of *rsync* was in [1996](https://ru.wikipedia.org/wiki/Rsync), and since then it has become practically the standard program for synchronization, that is, it has been tested by a multitude of users (unfortunately, I could not find even their approximate number) and is maintained and developed to this day (the latest version was released five days ago).
- transparency for the system. The internal data (commits) are ordinary files that require no packing or unpacking.

The reader is invited to judge how closely these goals have been achieved.
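The hard links mentioned above can be demonstrated in a few lines of Python (an illustrative sketch with made-up file names, not yarsync code):

```python
import os
import tempfile

# Illustration of hard links, the mechanism yarsync commits rely on.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "article.txt")
with open(original, "w") as f:
    f.write("important article")

# A hard link is another name for the same file (the same inode):
linked = os.path.join(workdir, "snapshot.txt")
os.link(original, linked)
assert os.stat(original).st_ino == os.stat(linked).st_ino

# Removing one name does not remove the data:
os.remove(original)
with open(linked) as f:
    assert f.read() == "important article"
```

Since both names point to the same inode, a commit made of hard links takes almost no extra space, which is why these file systems matter for yarsync.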
Getting started
---------------

Clone the program:

    $ git clone https://github.com/ynikitenko/yarsync

Inside the repository there is a subdirectory *yarsync*, containing the Python module *yarsync.py* and the executable file *yarsync* (which calls the former). Add the path to these files, for example, in *~/.bashrc*:

    export PATH=$PATH:~/yarsync/yarsync
    export PYTHONPATH=$PYTHONPATH:~/yarsync

If the program is found and works, let's create a new directory to explore the basic commands:

    me@myhost$ mkdir ~/tmp
    me@myhost$ cd ~/tmp

Initialize the repository:

    me@myhost$ yarsync init
    # init configuration for myhost
    mkdir .ys
    create configuration file .ys/config.ini

As the program output shows, initialization creates a hidden directory *.ys* with the configuration file *config.ini* (it is empty for now, and we will discuss it in more detail below). All internal data will reside only inside the *.ys* directory (with frequent use I also set up an alias *ys* for *yarsync*). To initialize a repository with existing data, run *yarsync init* inside the needed directory. If the repository has already been initialized, this command remains safe (that is, it does nothing).

Different states of the repository are recorded in commits, which are created with the *commit* command:

    me@myhost$ yarsync commit
    rsync -a --link-dest=../../.. --exclude=/.ys --exclude=/.ys/* /home/me/tmp/ /home/me/tmp/.ys/commits/1650462990_tmp
    mv /home/me/tmp/.ys/commits/1650462990_tmp /home/me/tmp/.ys/commits/1650462990
    mkdir /home/me/tmp/.ys/logs
    commit 1650462990 created

    When: Wed, 20 Apr 2022 16:56:30 MSK
    Where: me@myhost

As you can see, the program output is rather verbose at the moment (the full *rsync* commands are often printed). First, a temporary commit is created in the *.ys/commits* directory. Using *rsync*, hard links to the files of the working directory (that is, all files except the service directory) are created in the commit directory.
Since there are no files yet, the commit will be empty. If everything went well, the commit is moved to the directory without the *_tmp* suffix. In addition, a *.ys/logs* directory is created, where the commit description is written (in our case *.ys/logs/1650462990.txt* contains the time, the user and the machine where the commit was created).

A commit name is the number of seconds since the epoch (Unix time starts on January 1, 1970, 00:00:00 UTC), obtained with the Python function [time.time](https://docs.python.org/3/library/time.html#time.time). This is universal time, so commit names will be ordered regardless of the time zones on different machines. You can also add a commit description with the *-m* option:

    $ yarsync commit -m 'second commit'
    ...

Let's add a new file to the repository:

    $ touch example.txt
    $ yarsync status
    rsync -aun --delete -i --exclude=/.ys --exclude=/.ys/* --outbuf=L /home/me/tmp/ /home/me/tmp/.ys/commits/1650463725
    Changed since head commit:

    .d..t...... ./
    >f+++++++++ example.txt

    No syncronization information found.

The program output is given in the format of the *rsync -i* option ([--itemize-changes](https://linux.die.net/man/1/rsync)). The first line means that the root directory ('d') did not change ('.') apart from its timestamp ('t'). On the next line we see that our new file appeared since the last ("head") commit. As before, the *.ys* directory is not taken into account (*--exclude*), and no changes are made when the status is requested (*-n, --dry-run*).

You can also create a file *rsync-filter* in the *.ys* directory with the *rsync* filter syntax (it is very rich, see its manual). An example of its contents (comments are allowed):

    # data can be copied separately
    - /data

In this case we exclude the folder tmp/data from the repository and either ignore it or synchronize it separately.
Separate subrepositories are convenient for organizing one's work, but on backup drives a flat structure is more convenient, so that one does not have to search for nested repositories during synchronization (though this can be solved when we discuss working with several repositories at once).

Existing commits can be listed with the command

    $ yarsync log

The full list of commands is available via *yarsync --help*. Moreover, the *.ys* directory may reside outside the synchronized directory (using the *--config-dir* and *--root-dir* options). [My first post on Habr](https://habr.com/ru/post/425259/) was about static site pages under version control with a bare git repository in a separate folder.

The contents of commits are the files and folders of the root repository (at the moment of their creation). They can be browsed with an ordinary file manager, and standard commands like *find* can be run in them from a terminal. If you deleted a file in the working directory but then decided to restore it, you can copy it (create a hard link) from a commit where it was present. If, on the contrary, you do not need the old files, you can freely delete old commits with *rm -rf .ys/commits/*; neither the working directory nor the *yarsync* infrastructure will suffer.

Synchronization
---------------

Information about repositories is kept in the file *.ys/config.ini*. Let's create a simple configuration for a copy of our data in the folder *~/tmp2*:

    [tmp2]
    # empty host means local host
    host =
    path = /home/me/tmp2

If we try to copy the repository there with *yarsync push tmp2*, we will get an error. The program checks that the destination is indeed a valid repository (which is not the case, since we have not created it yet). Also, local changes must be committed before the data is sent.
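As a side note, the configuration above is an ordinary INI file and can be read with Python's standard configparser module. A minimal sketch (not yarsync's actual code; the section and field names follow the example above):

```python
import configparser

# Read a yarsync-style configuration (an illustration, not yarsync code).
CONFIG_TEXT = """
[tmp2]
# empty host means local host
host =
path = /home/me/tmp2
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

for name in config.sections():
    host = config[name].get("host", "")
    path = config[name]["path"]
    # An empty host means a local path; otherwise an rsync-style host:path.
    destination = path if not host else "{}:{}".format(host, path)
    print(name, "->", destination)
```

Each remote is just a section name with a few keys, which is why adding another replica is a three-line edit of *config.ini*.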
If we are sure that the folder *~/tmp2* is empty or does not exist, we can clone our repository there with the *-f, --force* flag:

    $ yarsync push -f tmp2
    # rsync -avHP --delete-after --include=/.ys/commits --include=/.ys/logs --exclude=/.ys/* /home/me/tmp/ /home/me/tmp2/

Going to that folder, we will see that it is identical to our first repository, as are the commits and their history (check with *yarsync log*). Although the commits and logs are copied in full, the configuration files (*config.ini*, *rsync-filter* and the others in the *.ys* folder) are not copied, that is, they are independent of each other.

The *rsync -H* flag means linking hard links in the *destination* (in our case *tmp2*) in the same way as in the *source*. If we look at the inodes of the files (*ls -i example.txt*) in *tmp* and *tmp2*, we will see that they differ, while within one clone they coincide between the commits and the working directory. The *--delete-after* flag makes *rsync* scan all files before any real changes, that is, collect all existing hard links. If two repositories share a synchronized commit, and in one of them a file was moved in a later commit, *rsync* will see the matching inode and will not retransmit the existing file.

If in the course of our work we created a new commit in *tmp2*, we can also transfer the changes back to *tmp*:

    $ cd ~/tmp
    $ yarsync pull tmp2

Of course, the configuration file may contain settings for more repositories (*tmp3* and so on) in several sections. The full file syntax is described in the [configparser](https://docs.python.org/3/library/configparser.html#supported-ini-file-structure) module of the standard library.

It often happens that the path to a repository changes. For example, if we copy data over the network, DHCP may give the computer a new IP address, and when a hard drive is plugged in, a unique path in */run/media* may be generated on the fly.
In that case an environment variable can be used in the configuration:

    [my_drive]
    path = $MYDRIVE/programming

and if we set the *MYDRIVE* variable, the path will be resolved correctly:

    $ export MYDRIVE=/run/media/my_drive
    $ yarsync push my_drive

In addition, when we synchronize with another repository, this information is saved in *.ys/sync.txt*. In our case this file will contain

    1650468609,tmp2

that is, the commit number and the name of the other repository. Synchronization information is also shown by the *status* and *log* commands:

    $ yarsync log
    commit 1650468609 <-> tmp2
    When: Wed, 20 Apr 2022 18:30:09 MSK
    ...

Merging versions
----------------

When we have several replicas of the data and regularly transfer changes from one to another (or if we use one of them only to access the files, while making all changes in another), we can work like that for quite a long time without any difficulties. However, at some point it may turn out that our commit histories have diverged (we added new files to both replicas), and we have to decide which state of the working directory should be considered correct for all repositories. In this case we must perform a merge.

Suppose our repositories in *tmp* and *tmp2* are synchronized. Let's create new commits in each of the replicas:

    $ cd ~/tmp2
    $ touch B
    $ yarsync commit -m 'Add B'
    ...
    commit 1650480050 created

and similarly add a file 'A' in *tmp*. Now, when we try to push the data from *tmp* to *tmp2*, the program will notice the diverging commits and raise an error:

    $ yarsync push tmp2
    Nothing to commit, working directory clean.
    # local repository is 1 commits ahead of tmp2
    # rsync -avHP --delete-after --include=/.ys/commits --include=/.ys/logs --exclude=/.ys/* /home/me/tmp/ /home/me/tmp2/
    rsync --list-only /home/me/tmp2/.ys/commits/
    !
    destination has commits missing on source: 1650480050,
    synchronize these commits first:
    1) pull missing commits with 'pull --new',
    2) push if these commits were successfully merged, or
    2') optionally checkout,
    3') manually update the working directory to the desired state, commit and push,
    2'') --force local state to remote (removing all commits and logs missing on the destination).

As we can see, several courses of action are suggested. The simplest one, if we are sure that the data in *tmp2* is out of date, is to write the state of the repository *tmp* there, deleting all new files with the *push -f* option.

The *push* command checks that in the source all changes were saved in a commit. In general this cannot be done in a remote repository (*rsync* cannot copy data between two remote machines), so changes on the other machine may not be saved in its local commit; for this reason the *pull* command has no *-f* option. We require that only saved states be synchronized, with the exception described below.

The *pull* command has a *--new* option (which *push* lacks) that transfers only new files from the remote repository (without destroying local files that are absent there):

    $ yarsync pull --new tmp2
    ...
    rsync --list-only /home/me/tmp2/.ys/commits/
    # rsync -avHP --include=/.ys/commits --include=/.ys/logs --exclude=/.ys/* /home/me/tmp2/ /home/me/tmp/
    merge 1650480038 and 1650480050 manually and commit
    (most recent common commit is 1650468609)

As we can see, the *rsync* command here has no *delete* flag. The program examines the commits in the remote source, finds the last common commit (if there is one) and points out the most recent commits in the local and remote repositories that need to be synchronized. At this point several other commands can help us:

    $ yarsync diff 1650480038 1650468609
    ...
    >f+++++++++ A

shows that in order to go from the common commit 1650468609 to commit ...38 (I hope the abbreviation is clear), the file A needs to be added.
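Because commits are stored as plain directories in the file system, a comparison like this can also be done with nothing but the Python standard library (a toy sketch: the commit directories and their contents are simulated here, with the names taken from the example above):

```python
import filecmp
import os
import tempfile

# Commits are ordinary directory snapshots, so standard tools
# can compare them directly.
base = tempfile.mkdtemp()
old_commit = os.path.join(base, "1650468609")
new_commit = os.path.join(base, "1650480038")
os.makedirs(old_commit)
os.makedirs(new_commit)

open(os.path.join(old_commit, "example.txt"), "w").close()
open(os.path.join(new_commit, "example.txt"), "w").close()
open(os.path.join(new_commit, "A"), "w").close()  # added in the new commit

cmp = filecmp.dircmp(old_commit, new_commit)
print(cmp.right_only)  # → ['A']
```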
Similarly, we can see what has changed in the other repository since the common commit (because the last remote commit has already been copied locally). In our case this is trivial, but if you last synchronized the other machine a year ago, this information will be very useful. Besides, since our commits are stored in the file system, we can simply compare them with *diff -r* (although we would have to type out their paths), so this command is rather a convenience.

At this point the local repository contains the union of the working directories of both replicas:

    $ yarsync status
    ...
    Changed since head commit:
    >f+++++++++ A
    Merging 1650480038 and 1650480050 (most recent common commit 1650468609).
    # local repository is 2 commits ahead of tmp2
    $ ls
    A  B  example.txt

*A* is considered the new file, because the last (head) commit was made in *tmp2* (while the working directory contains both files). The information about the merge is kept in the file *.ys/MERGE.txt*, which is created automatically. If we make a *commit* now, this file will be deleted, and the merge information will be added to the log.

But imagine a situation where we really did create a small new commit in *tmp2* five minutes ago, yet before that we worked in *tmp* for a long time, deleting and renaming many files in the working directory, and now together with the state of *tmp2* we have brought all those files back (along with the already renamed ones). In that case we can restore the more up-to-date commit with the *checkout* command:

    $ yarsync checkout 1650480038
    rsync -au --delete -i --exclude=/.ys --exclude=/.ys/* --outbuf=L /home/me/tmp/.ys/commits/1650480038/ /home/me/tmp
    *deleting B
    .d..t...... ./

When we do a *checkout*, the head commit becomes not the most recent one, but the one we checked out.
In particular, if we have not changed anything, the *status* command will show the difference against the checked-out commit (not the latest one), with the added line

    Detached HEAD (see 'yarsync log' for more recent commits)

The information about the head commit (if it is not the most recent one) is stored in *.ys/HEAD.txt*. If we decide that we no longer need the file *B*, we can make a commit right now and push the final version to *tmp2*:

    $ yarsync commit -m 'Merge.'
    $ yarsync push tmp2

During the commit both MERGE.txt and HEAD.txt are deleted automatically, and the new commit becomes the correct state of the repository. Since all the commits from *tmp2* have already been copied locally, *push* will cause no more difficulties.

Since old commits may be deleted at will, it may happen that the other repository keeps an old commit while the rest of its history coincides with the local copy (and its last commit is among the local ones). In that case one can either delete that old commit or transfer all commits with *pull --new*, in which case the local repository will automatically check out the correct (most recent local) commit. New heuristics may appear as the program develops, but in general merging states can only be done manually, following the algorithm described above.

Implementation
--------------

Writing an executable program in Python is not like creating an ordinary module, and working with commands is not like creating objects with states. As I mentioned above, there is a separate Python module *yarsync.py*. Since I would not want the user to have to type three extra characters at the end of the command, a separate executable file *yarsync* (which calls the former) had to be created. The Python module is still needed, though: it is very convenient to test with *pytest*. In one of the early versions I tried to fix the *rsync* options in a separate object, but in the end only the *YARsync* class remained.
Most of its methods are private. Moreover, unlike familiar objects, most of its methods may be unavailable: if we call *yarsync status*, initialization happens with the given command line arguments, and we simply cannot call the *_pull_push* method (pull and push are combined, because for *rsync* they differ only in the order of the last arguments), since the remote repository is unknown. The huge work of handling all possible command line arguments is done by the standard *argparse* module, and *rsync* is invoked via *subprocess*.

Security
--------

In general, parsing configuration files (especially with environment variable substitution) can be unsafe. If the *.ys* directory is absent in the current directory, it is searched for in its parent directories (as in git). I run *yarsync* in trusted directories; however, the *configparser* documentation says nothing about its use being unsafe, so I cannot be sure whether there is a vulnerability here or not. Perhaps the readers will advise on this matter.

I also count among security concerns the ability to delete personal data from a repository (remember that we have a great many commits, so it is not enough to delete a file from the working directory). Hard links are rather a plus here, because we can simply call *shred* on our file, and all its duplicates will be wiped at the same time. Then we can delete the files with a given path from all commits, and if we doubt whether the file was moved at some point, we can find it by inode with *find -inum*. Since remote commits will still contain this file, it will also have to be erased there with *push -f*.

If you synchronize personal files, their encryption may be important. I use standard encrypted [LUKS](https://ru.wikipedia.org/wiki/LUKS) partitions, which, once opened, are transparent to any synchronization.
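The structure described above can be sketched in a few lines (a toy dispatcher, not the actual yarsync code; the rsync options are taken from the push/pull examples earlier in the article, and the paths are made up):

```python
import argparse

def build_rsync_args(command, local, remote):
    # push and pull differ only in the order of the final
    # source/destination arguments.
    ends = [local, remote] if command == "push" else [remote, local]
    return ["rsync", "-avHP", "--delete-after"] + ends

parser = argparse.ArgumentParser(prog="toy-yarsync")
subparsers = parser.add_subparsers(dest="command", required=True)
for name in ("push", "pull"):
    subparsers.add_parser(name).add_argument("remote")

args = parser.parse_args(["push", "/home/me/tmp2/"])
rsync_command = build_rsync_args(args.command, "/home/me/tmp/", args.remote)
print(rsync_command)
# A real program would execute it via subprocess.run(rsync_command).
```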
Individual files can also be encrypted with [EncFS](https://github.com/vgough/encfs), which is based on FUSE and supports hard links (except in *paranoia* mode), so in principle it can be used with *yarsync*.

The main aspect of security for me is data safety, and I always do a *--dry-run* before the real *push* and *pull*:

    $ yarsync push -n dest

shows what exactly will be transferred to *dest* without making any physical changes. If an existing file was modified (as a result of an error or a file system failure), this will most likely be reflected too. For a more reliable check *rsync* has the *checksum* option (much slower; I have not needed it so far).

Alternatives
------------

Data synchronization is apparently an important task for programmers and system administrators, because merely listing all the known tools would require a separate article. Still, I find it appropriate to give a brief overview of the alternatives (here is a [slightly more complete list](https://github.com/ynikitenko/yarsync#alternatives) of the tools I paid attention to), in particular to show how they differ from *yarsync*.

First of all, version control systems are used to synchronize valuable mutable text files and programs. I have been using *git* for many years and do not have a single complaint about it. Version control systems offer much more than mere synchronization, but frankly speaking, I almost always type *git push* first of all so that my latest work does not get lost. On the other hand, if I downloaded an article from the Internet, it is convenient to have it saved, but its value is not critical for me and I do not need its versions (*yarsync* does not support them).

Unlike version control with its universally recognized leader, synchronization of ordinary files offers a much wider variety of tools. Tools for **continuous synchronization** include Dropbox, Yandex.Disk and many other cloud services.
I used them before, but personally I do not really like a service constantly consuming memory and CPU. Such programs, even when available for Linux, may be closed source, which is also a minus (though it did push me to study process isolation and SELinux better). One of the main drawbacks for me was that the free storage they offer is much smaller than the capacity of a hard drive I could buy. The paid Google Drive was convenient (although I no longer synchronized it with my computer), but for well-known reasons I have recently become physically unable to pay for it with a Russian card. Moreover, since I am not always online, I had no certainty that upon connection the synchronization would happen correctly and no files would be lost. The clear advantages of cloud services are that you do not have to carry a storage device with you (or maintain your own server), and that synchronization can happen with different devices (for example, an Android tablet). As a curious detail, Dropbox, for instance, [uses the rsync algorithm](https://ru.wikipedia.org/wiki/Rsync) through the *librsync* library. There are also many open source continuous synchronization tools that allow using your own server and are free of some of the drawbacks above (but for me personally an external hard drive is still much cheaper than paying for server space).

There are numerous programs for **backup** and **archiving**. The topic of [backup](https://ru.wikipedia.org/wiki/Резервное_копирование) itself is very broad. Besides saving the data, the aspect of *restoring* it from a backup is also important. Unfortunately, while studying such programs I occasionally saw posts (issues) about how an archive got corrupted, and in this sense transparent commits with hard links require no restoration at all (if *yarsync* is unavailable, it can be done manually).
Of course, this does not mean that such programs are always worse or will not suit your case.

I also used the [git-annex](https://git-annex.branchable.com/) program for a long time. It is a huge, complex Haskell program built on *git*, with a multitude of features, about which the author writes that it is neither backup nor archiving (although *git-annex assistant* allows file synchronization between computers on OSX and Linux). Unfortunately, I did not manage to understand it well, and at some point one of the copies of my data ended up in an unusable state (though the author of the program promptly helped on its forum). Gradually getting to know *git-annex*, I discovered that it does not preserve file timestamps, which was unacceptable for me, because whether I created a file a year ago or 10 years ago is important information to me. After that I moved on to classic *rsync* and to creating a convenient configuration for it.

Final remarks
-------------

For synchronizing several repositories at once, the [myrepos](https://myrepos.branchable.com/) program (from the creator of *git-annex*) is very convenient. For example, if you want to push new commits in several repositories to a remote machine, you can do it with a single command *mr push*. It works with *git*, but since the *yarsync* interface is very close to it, the *.mrconfig* configuration is very easy to adapt.

If we imagine that *yarsync* still has a long way to go, the program is closer to the beginning of that way. At the moment it is used by only one person, and although I publish it openly, I know many of its shortcomings: the output is plain text, without colors or highlighting; a commit description can only be passed via the *-m* option (no text editor is invoked); the output itself will clearly be improved; and, worst of all, I still have not decided whether the rsync commands, when shown, should be printed from the beginning of the line or after the characters "# ".
Hard links are needed for the full functionality, but perhaps this requirement can sometimes be relaxed. It is known that although *rsync* belongs first of all to Linux and related systems, it can also be used on Windows, which I have not tried either, since I do not use that OS.

Despite the shortcomings outlined above, I personally found working with *yarsync* convenient and clear. It performs its main tasks, *commit*, *push* and *pull*, reliably. At the same time I cannot say that I trust it, because I always run *status* first and try synchronization with the *-n* flag. In general I do not trust any program, and the only (sufficiently) reliable way for me is to have at least three copies of my data (this is even formalized as the [3-2-1 rule](https://habr.com/ru/company/veeam/blog/188544/)). I have been copying data to several machines and drives for a very long time, but if changes are possible in several repositories, without commits this turns into chaos (or updating some of the copies has to be forbidden). By now I have added almost all the data I value (be it music, photographs, books or articles) to *yarsync* repositories.

I think *yarsync* is close to the Unix philosophy, because it uses simple existing tools (*rsync* and the file system). If the community shares the proposed ideas, the program will quickly overcome its current shortcomings.

This article was written for the Habr contest [Технотекст 2021](https://contenting.io/2021.html). The contest's motto this year is Stanisław Lem's words "nothing ages as fast as the future". I think *rsync* fits this motto: despite the appearance of many programs with the same purpose, they are born, age and disappear, while *rsync* keeps existing and being used for more than 25 years now, because the past ages much more slowly than the future.
yarsync-0.3.1/docs/make.bat

@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd

yarsync-0.3.1/docs/requirements.txt

sphinx
# to use Markdown with Sphinx
myst-parser
# sphinx theme
furo

yarsync-0.3.1/docs/source/conf.py

# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

import os
import sys

# for readthedocs, https://pennyhow.github.io/blog/making-readthedocs/
# Otherwise yarsync module won't be found.
sys.path.insert(0, os.path.abspath('../../'))

from yarsync.version import __version__

project = 'YARsync'
copyright = '2021-2025, Yaroslav Nikitenko'
author = 'Yaroslav Nikitenko'
release = __version__

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    # to include the manual in Markdown
    "myst_parser",
]

templates_path = ['_templates']
exclude_patterns = []

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'furo'
# html_theme = 'alabaster'
html_static_path = ['_static']

yarsync-0.3.1/docs/source/details.rst

========
Advanced
========

-----------
Usage tips
-----------

Since ``yarsync`` allows using a command interface similar to ``git``,
one can synchronize several repositories simultaneously using
`myrepos `_.

If new data was added to several repositories simultaneously, commit
the changes on one of them and synchronize that with the other one.
``rsync`` should link the working directory with commits properly.
This may fail depending on how you actually copied files (they may
have changed attributes). In this case, create new commits in both
repositories and manually rename them to be the same. Try to
synchronize to see that all is linked properly.

For example, when we move photographs from an SD card, we want to have
at least two copies of them. It would be more reliable to copy data
from the original source to two repositories than to push that from
one of them to another (possible errors on the intermediate filesystem
increase the risk). Make sure that the two repositories were
synchronized beforehand.

------------
Development
------------

Community contributions are very important for free software projects.
The best thing for the project at this starting phase is to spread
information and create packages for new operating systems. ``yarsync``
was tested on ext4, NFSv4 and SimFS on Arch Linux and CentOS. Tests on
other systems would be useful.

----------
Hard links
----------

The file system must support hard links if you plan to use *commits*.
Multiple hard links are supported by POSIX-compliant and partially
POSIX-compliant operating systems, such as Linux, Android, macOS, and
also Windows NT4 and later Windows NT operating systems [`Wikipedia `_].

Notable file systems to **support hard links** include
[`hard links `_ and `comparison of file systems `_ from Wikipedia]:

* EncFS (an Encrypted Filesystem using FUSE). Note that it doesn't
  support hard links `when External IV Chaining is enabled `_ (this is
  enabled by default in paranoia mode, and disabled by default in
  standard mode).
* ext2-ext4. Standard on Linux. Ext4 has a limit of
  `65000 hard links `_ on a file.
* HFS+. Standard on Mac OS.
* NTFS. The only Windows file system to support hard links. It has a
  limit of `1024 hard links `_ on a file.
* SquashFS, a compressed read-only file system for Linux.

Hard links are **not supported** on:

* FAT, exFAT. These are used on many flash drives.
* Joliet ("CDFS"), ISO 9660. File systems on CDs.

The majority of modern file systems support hard links. A full list of
`file system capabilities `_ can be found on Wikipedia. One can copy
data to file systems without hard links, but this will reduce the
functionality of ``yarsync``, and one should take care not to consume
too much disk space if accidentally copying files instead of hard
linking.

-----------------
rsync limitations
-----------------

* `Millions of files `_ will be synced very slowly.
* ``rsync`` freezes when encountering **too many hard links**. Users
  report problems for repositories of `200 G `_ or `90 GB `_, with
  many hard links.
  For the author's repository with 30 thousand files (160 thousand
  with commits) and 3 Gb of data ``rsync`` works fine. If you have a
  large repository and want to copy it with all hard links, it is
  recommended to create a separate partition (e.g. LVM) and copy the
  filesystem as a whole. You can also remove some of the older backups.
* ``rsync`` may create separate files instead of hard linking them. It
  can be fixed quickly using the `hardlink `_ executable.

------------
Alternatives
------------

`Free software that uses rsync `_ includes:

* `Back In Time `_. See previous snapshots using a GUI.
* Grsync, graphical interface for rsync.
* `LuckyBackup `_. It is written in C++ and is mostly used from a
  graphical shell.
* `rsnapshot `_, a filesystem snapshot utility. ``rsnapshot`` makes it
  easy to make periodic snapshots of local machines, and remote
  machines over ssh. Files can be restored by the users who own them,
  without the root user getting involved.

Other synchronization / backup / archiving software:

* `casync `_ is a combination of the rsync algorithm and
  content-addressable storage. It is an efficient way to deliver and
  update directory trees and large images over the Internet in an HTTP
  and CDN friendly way. Other systems that use `similar algorithms `_
  include `bup `_.
* `Duplicity `_ backs up directories by producing encrypted tar-format
  volumes and uploading them to a remote or local file server.
  ``duplicity`` uses ``librsync`` and is space efficient. It supports
  many cloud providers. In 2021 ``duplicity`` supports deleted files,
  full unix permissions, directories, and symbolic links, fifos, and
  device files, but not hard links. It can be run on Linux, MacOS and
  Windows (`under Cygwin `_).
* `Git-annex `_ manages distributed copies of files using git. This is
  a very powerful tool written in Haskell. It allows for each file to
  track the number of backups that contain it and their names, and it
  allows to plan downloading of a file to the local storage.
  This is its author's `use case `_: "I have a ton of drives. I have a
  lot of servers. I live in a cabin on dialup and often have 1 hour on
  broadband in a week to get everything I need". I tried to learn
  ``git-annex``, it was `uneasy `_, and finally I found that it
  `doesn't preserve timestamps `_ (because ``git`` doesn't) and
  `permissions `_. If that suits you, there is also a list of
  specialized `related software `_. ``git-annex`` allows to use many
  cloud services as `special remotes `_, including all
  `rclone remotes `_.
* `Rclone `_ focuses on cloud and other high latency storage. It
  supports more than 50 different providers. As of 2021, it doesn't
  preserve permissions and attributes.

Continuous synchronization software:

* `gut-sync `_ offers a real-time bi-directional folder
  synchronization.
* `Syncthing `_. A very powerful and developed tool, works on Linux,
  MacOS, Windows and Android. Mostly uses a GUI (the admin panel is
  managed through a Web interface), but also has a
  `command line interface `_.
* `Unison `_ is a file-synchronization tool for OSX, Unix, and
  Windows. It allows two replicas of a collection of files and
  directories to be stored on different hosts (or different disks on
  the same host), modified separately, and then brought up to date by
  propagating the changes in each replica to the other (pretty much
  like other synchronization tools work).
* Dropbox, Google Drive, Yandex Disk and many other closed-source
  tools fall into this category.

ArchWiki includes several useful `scripts for rsync `_ and a list of
its `graphical front-ends `_. It also has a
`list of cloud synchronization clients `_ and a
`list of synchronization and backup programs `_. Wikipedia offers a
`comparison of file synchronization software `_ and a
`comparison of backup software `_. Git-annex has a list of
`git-related `_ tools.

yarsync-0.3.1/docs/source/index.rst

..
   YARsync documentation master file, created by sphinx-quickstart on
   Mon Jan 2 17:35:34 2023. You can adapt this file completely to your
   liking, but it should at least contain the root `toctree` directive.

***********************************
Welcome to YARsync documentation!
***********************************

============
Introduction
============

.. include:: ../../README.rst
   :start-line: 4

.. toctree::
   :caption: Documentation:

   Manual
   Advanced
.. raw:: latex

   \chapter{Manual}

.. only:: latex

   .. include:: yarsync.1.md
      :parser: myst_parser.sphinx_

.. only:: latex

   .. include:: details.rst

.. # have no idea why it doesn't affect anything
   :maxdepth: 2
   :titlesonly:

.. Indices and tables
   ==================

   * :ref:`genindex`
   * :ref:`modindex`
   * :ref:`search`

yarsync-0.3.1/docs/source/yarsync.1.md

% YARSYNC(1) yarsync 0.3 | YARsync Manual
% Written by Yaroslav Nikitenko
% March 2025

# NAME

yarsync - a file synchronization and backup tool

# SYNOPSIS

**yarsync** [**-h**] \[**\--config-dir** *DIR*\] \[**\--root-dir** *DIR*\] \[**-q** | **-v**\] *command* \[*args*\]

[comment]: # (to see it converted to man, use pandoc yarsync.1.md -s -t man | /usr/bin/man -l -)

# DESCRIPTION

Yet Another Rsync stores rsync configuration and synchronizes repositories with the interface similar to git. It is *efficient* (files in the repository can be removed and renamed freely without additional transfers), *distributed* (several replicas of the repository can diverge, and in that case a manual merge is supported), *safe* (it takes care to prevent data loss and corruption) and *simple* (see this manual).

# QUICK START

To create a new repository, enter the directory with its files and type

    yarsync init

This operation is safe and will not affect existing files (including configuration files in an existing repository). Alternatively, run **init** inside an empty directory and add files afterward.

To complete the initialization, make a commit:

    yarsync commit -m "Initial commit"

**commit** creates a snapshot of the working directory, which is all files in the repository except **yarsync** configuration and data. This snapshot is very small, because it uses hard links. To check how much your directory size has changed, run **du**(1).

Commit name is the number of seconds since the Epoch (integer Unix time).
This allows commits to be ordered in time, even for hosts in different zones. Though this works on most Unix systems and Windows, the epoch is platform dependent.

After creating a commit, files can be renamed, deleted or added. To see what was changed since the last commit, use **status**. To see the history of existing commits, use **log**.

Hard links are excellent at tracking file moves or renames and storing accidentally removed files. Their downside is that if a file gets corrupt, this will apply to all of its copies in local commits. The 3-2-1 backup rule requires to have at least 3 copies of data, so let us add a remote repository \"my\_remote\":

    yarsync remote add my_remote remote:/path/on/my/remote

For local copies we still call the repositories \"remote\", but their paths would be local:

    yarsync remote add my_drive /mnt/my_drive/my_repo

This command only updated our configuration, but did not make any changes at the remote path (which may not exist). To make a copy of our repository, run

    yarsync clone new-replica-name host:/mnt/my_drive/my_repo

**clone** copies all repository data (except configuration files) to a new replica with the given name and adds the new repository to remotes.

To check that we set up the repositories correctly, make a dry run with \'**-n**\':

    yarsync push -n new-replica-name

If there were no errors and no file transfers, then we have a functioning remote.

We can continue working locally, adding and removing files and making commits. When we want to synchronize repositories, we **push** the changes *to* or **pull** them *from* a remote (first with a **\--dry-run**). This is the recommended workflow, and if we work on different repositories in sequence and always synchronize changes, our life will be easy. Sometimes, however, we may forget to synchronize two replicas and they will end up in a diverged state; we may actually change some files or find them corrupt.
Solutions to these problems involve user decisions and are described in **pull** and **push** options.

# OPTION SUMMARY

|                    |                                                       |
|--------------------|-------------------------------------------------------|
| \--help, -h        | show help message and exit                            |
| \--config-dir=DIR  | path to the configuration directory                   |
| \--root-dir=DIR    | path to the root of the working directory             |
| \--quiet, -q       | decrease verbosity                                    |
| \--verbose, -v     | increase verbosity                                    |
| \--version, -V     | print version                                         |

# COMMAND SUMMARY

|              |                                                             |
|--------------|-------------------------------------------------------------|
| **checkout** | restore the working directory to a commit                   |
| **clone**    | clone a repository                                          |
| **commit**   | commit the working directory                                |
| **diff**     | print the difference between two commits                    |
| **init**     | initialize a repository                                     |
| **log**      | print commit logs                                           |
| **pull**     | get data from a source                                      |
| **push**     | send data to a destination                                  |
| **remote**   | manage remote repositories                                  |
| **show**     | print log messages and actual changes for commit(s)         |
| **status**   | print updates since last commit                             |

# OPTIONS

**\--help**, **-h**

:   Prints help message and exits. Default if no arguments are given. After a command name, prints help for that command.

**\--config-dir=DIR**

:   Provides the path to the configuration directory if it is detached. Both **\--config-dir** and **\--root-dir** support tilde expansion for user's home directory. See SPECIAL REPOSITORIES for usage details.

**\--root-dir=DIR**

:   Provides the path to the root of the working directory for a detached repository. Requires **\--config-dir**. If not set explicitly, the default working directory is the current one.

**\--quiet**, **-q**

:   Decreases verbosity. Does not affect error messages (redirect them if needed).

**\--verbose**, **-v**

:   Increases verbosity. May print more rsync commands and output. Conflicts with **\--quiet**.

**\--version**, **-V**

:   Prints the **yarsync** version and exits.
If **\--help** is given, it takes precedence over **\--version**.

# COMMANDS

All commands support the **\--help** option. Commands that can change a repository also support the **\--dry-run** option.

**\--dry-run**, **-n**

:   Prints what will be transferred during a real run, but does not make any changes.

**\--help**, **-h**

:   Prints help for a command or a subcommand.

# checkout

**yarsync checkout** \[**-h**] \[**-n**] *commit*

Restores the working directory to its state during *commit*.

WARNING: this will overwrite the working directory. Make sure that all important data is committed. Make a dry run first with **-n**.

If not the most recent commit was checked out, the repository HEAD (in git terminology, see **git-checkout**(1)) becomes detached, which prevents such operations as **pull** or **push**. To advance the repository to its correct state, check out the last commit or make a new one.

*commit*

:   The commit name (as printed in **log** or during **commit**).

# clone

**yarsync clone** \[**-h**] *name* *path|parent-path*

One can clone from within an existing repository **to** *parent-path* or clone **from** a repository at *path*. In both cases a new directory with the repository is created, having the same name as the original repository folder. If that directory already exists, **clone** will fail (several safety checks are being made). The local repository (origin or clone) will add another one as a remote.

Note that only data (working directory, commits, logs and synchronization information, not configuration files) will be cloned. This command will refuse to clone **from** a repository with a filter (see SPECIAL REPOSITORIES).

*parent-path* is useful when we want to clone several repositories into one directory. It allows us to use the same command for each of them (manually or with **mr**(1)).
If one needs to have a different directory name for a repository, they can rename it manually (we don't require, but strongly encourage, the same directory names for all replicas). ### Positional arguments *name* : Name of the new repository. *path* : Path to the source repository (local or remote). Trailing slash is ignored. *parent-path* : Path to the parent directory of the cloned repository (local or remote). Trailing slash is ignored. # commit **yarsync commit** \[**-h**] \[**-m** *message*] \[**--limit** *number*] Commits the working directory (makes its snapshot). See QUICK START for more details on commits. **\--limit**=*number* : Maximum number of commits. If the current number of commits exceeds that, older ones are removed during **commit**. See SPECIAL REPOSITORIES for more details. *message* : Commit message (used in logs). Can be empty. # diff **yarsync diff** \[**-h**] *commit* \[*commit*] Prints the difference between two commits (from the old one to the new one; the order of arguments is unimportant). If the second commit is omitted, compares *commit* to the most recent one. See **status** for the output format. *commit* : Commit name. # init **yarsync init** \[**-h**] \[*reponame*] Initializes a **yarsync** repository in the current directory. Creates a configuration folder with repository files. Existing configuration and files in the working directory stay unchanged. Create a first commit for the repository to become fully operational. *reponame* : Name of the repository. If not provided on the command line, you will be prompted for it. # log **yarsync log** [**-h**] \[**-n** *number*] \[**-r**] Prints commit logs (from newest to oldest), as well as synchronization information when it is available. To see changes in the working directory, use **status**. ### Options **\--max-count**=*number*, **-n** : Maximum number of logs shown. **\--reverse**, **-r** : Reverse log order. 
### Example To print information about the three most recent commits, use yarsync log -n 3 # pull **yarsync pull** \[**-h**] \[**-f** | **\--new** | **-b** | **\--backup-dir** *DIR*] [**-n**] *source* Gets data from a remote *source*. The difference between **pull** and **push** is mostly just the direction of transfer. **pull** and **push** bring two repositories into the same state. They synchronize the working directory, that is, they add to the destination new files from source, remove those missing on source and do all renames and moves of previously committed files efficiently. This is done in one run, and these changes apply also to logs, commits and synchronization. In most cases, we do not want our existing logs and commits to be removed though. By default, several checks are made to prevent data loss: - local has no uncommitted changes, - local does not have a detached HEAD, - local is not in a merging state, - destination has no commits missing on source. If any of these checks fails, no modifications will be made. Note that the remote may have uncommitted changes itself: always make a dry run with **-n** first! To commit local changes to the repository, use **commit**. The HEAD commit could be changed during **checkout** (see its section for the solutions). If the destination has commits missing on source, there are two options: to **\--force** changes to the destination (removing these commits) or to merge changes inside the local repository with **pull \--new**. If we pull new commits from the remote, this will bring the repository into a merging state. The merge will be done automatically if the last remote commit is among local ones (in that case only some older commits were transferred from there). If some recent remote commits are not present locally, however, this means that the histories of the repositories have diverged, and we will need to merge them manually. 
After we have all local and remote commits and the union of the working directories in our local repository, we can safely choose the easiest way for us to merge them. To see the changes, use **status** and **log**. For example, if we had added a file in a *remote_commit* earlier and it has now appeared in the working directory, we can just **commit** the changes. If we have made many local changes, renames and removals since then, it may be better to **checkout** our latest commit (remember that all files from the working directory are present in commits, so it is always safe) and link the new file to the working directory: ln .ys/commits/\<commit\>/path/to/file . (the file can be moved to a subdirectory without the risk of breaking hard links). If the remote commit was actually large, and local changes were recent but small, then we should check out the remote commit and apply local changes by hand. After our working directory is in the desired state, we **commit** changes and the merge is finished. The result can then be pushed to the remote without problems. ### pull options **\--new** : Do not remove local data that is missing on *source*. While this option can return deleted or moved files back to the working directory, it also adds remote logs and commits that were missing here (for example, old or unsynchronized commits). A forced **push** to the remote could remove these logs and commits, and this option allows one to first **pull** them to the local repository. After **pull \--new** the local repository can enter a merging state. See the **pull** description for more details. **\--backup**, **-b** : Changed files in the working directory are renamed (appended with \'**~**\'). See **\--backup-dir** for more details. **\--backup-dir** *DIR* : Changed local files are put into a directory *DIR* preserving their relative paths. *DIR* can be an absolute path or relative to the root of the repository. In contrast to **\--backup**, **\--backup-dir** does not change resulting file names. 
This option is convenient for large file trees, because it recreates the existing file structure of the repository (one doesn't have to search for new backup files in all subdirectories). For the current rsync version, the command yarsync pull --backup-dir BACKUP will copy updated files from the remote and put them into the directory \"BACKUP/BACKUP\" (this is how rsync works). To reduce confusion, make a standard **pull** first (so that during the backup there are only file updates). This option is available only for **pull**, because it is assumed that the user will apply local file changes after backup. For example, suppose that after a **pull \--backup** one gets files *a* and *a~* in the working directory. One should first determine which version is correct. If it is the local file *a~*, then the backup can be moved back: mv a~ a By local we mean the one hard linked with local commits (run *ls -i* to be sure). If the remote version is correct though, you first need to overwrite the local version without breaking the hard links. This can be done with the rsync option \"\--inplace\": rsync --inplace a a~ mv a~ a # check file contents and the links ls -i a .ys/commits/*/a For a **\--backup-dir** and for longer paths, these commands will be longer. Finally, if you need several versions, just save one of the files under a different name in the repository. After you have fixed all corrupt files, push them back to the remote. ### pull and push options **\--force**, **-f** : Updates the working directory, removing commits and logs missing on source. This command brings two repositories to the nearest possible states: their working directories, commits and logs become the same. While working directories are always identical after **pull** or **push** (except for some of the **pull** options), **yarsync** generally refuses to remove existing commits or logs \- unless this option is given. 
Use it if the destination has really unneeded commits or just remove them manually (see FILES for details on the commit directory). See also **pull \--new** on how to fetch missing commits. # push **yarsync push** \[**-h**] \[**-f**] \[**-n**] *destination* Sends data to a remote *destination*. See **pull** for more details and common options. # remote **yarsync remote** \[**-h**] \[**-v**] \[*command*] Manages remote repositories configuration. By default, prints existing remotes. For more options, see *.ys/config.ini* in the FILES section. **-v** : Verbose. Prints remote paths as well. ### **add** **yarsync remote add** \[**-h**] *repository* *path* Adds a new remote. *repository* is the name of the remote in local **yarsync** configuration (as it will be used later during **pull** or **push**). *path* has a standard form \[user@]host:\[path] for an actually remote host or it can be a local path. Since **yarsync** commands can be called from any subdirectory, local path should be absolute. Tilde for user's home directory \'**~**\' in paths is allowed. ### rm **yarsync remote rm** \[**-h**\] *repository* Removes an existing *repository* from local configuration. ### show Prints remote repositories. Default. # show **yarsync show** \[**-h**] *commit* \[*commit* ...\] Prints log messages and actual changes for commit(s). Changes are shown compared to the commit before *commit*. For the output format, see **status**. Information for several commits can be requested as well. *commit* : Commit name. # status **yarsync status** \[**-h**] Prints working directory updates since the last commit and the repository status. If there were no errors, this command always returns success (irrespective of uncommitted changes). ### Output format of the updates The output for the updates is a list of changes, including attribute changes, and is based on the format of *rsync \--itemize-changes*. For example, a line .d..t...... 
programming/ means that the modification time \'*t*\' of the directory \'*d*\' *programming/* in the root of the repository has changed (files were added to or removed from it). All its other attributes are unchanged (\'.\'). The output is an 11-letter string of the format \"YXcstpoguax\", where \'Y\' is the update type, \'X\' is the file type, and the other letters represent attributes that are printed if they were changed. For a newly created file these would be \'+\', like >f+++++++++ /path/to/file The attribute letters are: **c**hecksum, **s**ize, modification **t**ime, **p**ermissions, **o**wner and **g**roup. The **u** slot can in fact mean **u**se (access) time, creatio**n** time (\'**n**\') or **b**oth (\'**b**\'). **a** stands for ACL, and **x** for extended attributes. Complete details on the output format can be found in the **rsync**(1) manual. # SPECIAL REPOSITORIES A **detached** repository is one with the **yarsync** configuration directory outside the working directory. To use such a repository, one must provide **yarsync** options **\--config-dir** and **\--root-dir** with every command (**alias**(1p) may be of help). To create a detached repository, use **init** with these options or move the existing configuration directory manually. For example, if one wants to have several versions of static Web pages, they may create a detached repository and publish the working directory without the Web server having access to the configuration. Alternatively, if one really wants both continuous synchronization and **yarsync** backups, they can try moving its configuration outside. Commits in such repositories can be created or checked out, but **pull** or **push** are currently not supported (one will have to synchronize them manually). A detached repository is similar to a bare repository in git, but usually has a working directory. A repository with a **filter** can exclude (disable tracking) some files or directories from the working directory. 
This may be convenient, but makes synchronization less reliable, and such a repository cannot be used as a remote. See **rsync-filter** in the FILES section for more details. A repository can have a **commit limit**. The maximum number of commits can be set during **commit**. **pull** and **push** do not check for missing commits on the destination when we are in a repository with a commit limit. This makes a repository with a commit limit more like a central repository. If we have reached the maximum number of commits, older ones are deleted during a new **commit**. The commit limit is stored in **.ys/COMMIT_LIMIT.txt**. It can be changed or removed at any time. The commit limit was introduced in ``yarsync v0.2`` and was designed to help against the problem of too many hard links (if it exists). # FILES All **yarsync** repository configuration and data is stored in the hidden directory **.ys** under the root of the working directory. If the user no longer wants to use **yarsync** and the working directory is in the desired state, they can safely remove the **.ys** directory. Apart from the working directory, only commits, logs and synchronization data are synchronized between the repositories. Each repository has its own configuration and name. ## User configuration files **.ys/config.ini** : Contains names and paths of remote repositories. This file can be edited directly or with **remote** commands according to the user's preference. **yarsync** supports synchronization only with existing remotes. A simple configuration for a remote \"my\_remote\" could be: [my_remote] path = remote:/path/on/my/remote Several sections can be added for more remotes. An example (non-effective) configuration is created during **init**. Note that comments in **config.ini** can be erased during **remote** {**add**,**rm**}. 
Since removable media or remote hosts can change their paths or IP addresses, one may use variable substitution in paths: [my_drive] path = $MY_DRIVE/my_repo For the substitutions to take effect, export these variables before running: $ export MY_DRIVE=/run/media/my_drive $ yarsync push -n my_drive If we made a mistake in the variable or path, it will be shown in the printed command. Always use **\--dry-run** first to ensure proper synchronization. Another **yarsync** remote configuration option is **host**. If both **path** and **host** are present, the effective path will be their concatenation \"\<host\>:\<path\>\". An empty **host** means the local host and does not prepend the path. It is possible to set a default **host** for each section from the section name. For that, add a default section with an option **host_from_section_name**: [DEFAULT] host_from_section_name Empty lines and lines starting with \'**#**\' are ignored. Section names are case-sensitive. Whitespace in a section name is considered part of the name. Spaces around \'**=**\' are allowed. Full syntax specification can be found at . **.ys/repo_\<reponame\>.txt** : Contains the repository name, which is used in logs and usually should coincide with the remote name (how the local repository is called on remotes). The name can be set during **init** or edited manually. Each repository replica must have a unique name. For example, if one has repositories \"programming/\" and \"music/\" on a laptop \"my\_host\", their names would probably be \"my\_host\", and the names of their copies on an external drive could be \"my\_drive\" (this is different from git, which uses only the author's name in logs). Note that, for technical reasons, **clone** from inside a repository creates a temporary file with the new repository name (which is also written in **CLONE_TO_\<reponame\>.txt**). If these files remain on the system due to errors, they can be safely removed. 
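The **config.ini** options described above (per-remote **path**, **host**, **host_from_section_name** and variable substitution) can be combined. A hypothetical sketch, with invented section names and paths, could look as follows:

```ini
# Hypothetical example; all names and paths are invented.
[DEFAULT]
# use each section name as the default host
host_from_section_name

[my_host]
# effective path becomes my_host:/home/user/my_repo
path = /home/user/my_repo

[my_drive]
# an empty host overrides the default and means the local host
host =
# export MY_DRIVE before running yarsync
path = $MY_DRIVE/my_repo
```

With such a configuration, transfers to \"my_host\" would go over the network, while \"my_drive\" would stay a local path expanded from the environment.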
**.ys/rsync-filter** : Contains rsync filter rules, which effectively define what data belongs to the repository. The **rsync-filter** does not exist by default, but can be added for flexibility. For example, the author has a repository \"~/work\", but wants to keep his presentations in \"tex/\" in a separate repository. Instead of having a different directory \"~/work\_tex\", he adds such rules to **rsync-filter**: # all are in git repositories - /repos # take care to sync separately - /tex In this way, \"~/work/tex\" and contained git repositories will be excluded from \"~/work\" synchronization. Lines starting with \'**#**\' are ignored, as well as empty lines. To complicate things, one could include a subdirectory of \"tex\" into \"work\" with an include filter \'**+**\'. For complete details, see FILTER RULES section of **rsync**(1). While convenient for everyday use, filters make backup more difficult. To synchronize a repository with them, one has to remember that it has subdirectories that need to be synchronized too. If the remote repository had its own filters, that would make synchronization even more unreliable. Therefore filters are generally discouraged: **pull** and **push** ignore remote filters (make sure you synchronize only *from* a repository with filters), while **clone** refuses to copy from a repository with **rsync-filter**. ## yarsync technical directories **.ys/commits/** : Contains local commits (snapshots of the working directory). If some of the old commits are no longer needed (there are too many of them or they contain a large file), they can be removed. Make sure, however, that all remote repositories contain at least some of the present commits, otherwise future synchronization will get complicated. Alternatively, remove unneeded files or folders manually: commits can be edited, with care taken to synchronize them correctly. **.ys/logs/** : Contains text logs produced during **commit**. 
They are not necessary, so removing any of them will not break the repository. If one wants to fix or improve a commit message though, they may edit the corresponding log (the change will be propagated during **push**). It is recommended to store logs even for old deleted commits, which may be present on formerly used devices. **.ys/sync/** : Contains synchronization information for all known repositories. This information is transferred between replicas during ``pull``, ``push`` and ``clone``, and it allows ``yarsync`` repositories to better support the 3-2-1 backup rule. The information is contained in empty files with names of the format **commit_repo.txt**. Pulling (or cloning) from a repository does not affect its files and does not update its synchronization information. **push** (and the corresponding **clone**) updates synchronization for both replicas. For each repository only the most recent commit is stored. The **sync** directory was introduced in ``yarsync v0.2``. See the release notes on how to convert old repositories to the new format or do it manually, if necessary. If a replica has been permanently removed, its synchronization data must be removed manually and the removal propagated with **\--force**. # EXIT STATUS **0** : Success **1** : Invalid option **7** : Configuration error **8** : Command error **9** : System error **2-6**,**10-14**,**20-25**,**30**,**35** : rsync error If the command could be run successfully, a zero code is returned. The invalid option code is returned for mistakes in command-line argument syntax. A configuration error can occur when we are outside an existing repository or a **yarsync** configuration file is missing. If the repository is correct, but the command is not allowed in its current state (for example, one cannot push or pull when there are uncommitted changes or add a remote with an already present name), a command error is returned. 
It is also possible that a general system error, such as a keyboard interrupt, is raised in the Python interpreter. See **rsync**(1) for rsync errors. # DIAGNOSTICS To check that your clocks at different hosts (used for properly ordering commits) are synchronized well enough, run python -c 'import time; print(time.time())' To make sure that the local repository supports hard links instead of creating file copies, test it with du -sh . du -sh .ys (can be run during **pull** or **clone** if they take too long). The results must be almost the same. If not, your file system may not support hard links (in which case you should not use **yarsync** on it), you may have large deleted files stored in old commits, or you may have subdirectories excluded with a **filter** (see the SPECIAL REPOSITORIES section). To test that a particular file \"a\" was hard linked to its committed versions, run ls -i a .ys/commits/*/a If all is correct, their inodes must be the same. Hard links can be broken in a cloned git repository (as has happened with **yarsync** tests before), because git does not preserve them. To fix hard links for the whole repository, run **hardlink**(1) in its root. # SEE ALSO **rsync**(1) The yarsync page is . # BUGS Requires a file system with hard links, rsync version at least 3.1.0 (released 28 September 2013) and Python >= 3.6. Always do a **\--dry-run** before actual changes. Occasionally Python errors are raised instead of correct return codes. Please report any bugs or make feature requests to . # COPYRIGHT Copyright © 2021-2025 Yaroslav Nikitenko. License GPLv3: GNU GPL version 3 .\ This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. 
yarsync-0.3.1/docs/yarsync.1000066400000000000000000000755371477154272500157440ustar00rootroot00000000000000'\" t .\" Automatically generated by Pandoc 3.1.12.1 .\" .TH "YARSYNC" "1" "March 2025" "yarsync 0.3" "YARsync Manual" .SH NAME yarsync \- a file synchronization and backup tool .SH SYNOPSIS \f[B]yarsync\f[R] [\f[B]\-h\f[R]] [\f[B]\-\-config\-dir\f[R] \f[I]DIR\f[R]] [\f[B]\-\-root\-dir\f[R] \f[I]DIR\f[R]] [\f[B]\-q\f[R] | \f[B]\-v\f[R]] \f[I]command\f[R] [\f[I]args\f[R]] .SH DESCRIPTION Yet Another Rsync stores rsync configuration and synchronizes repositories with the interface similar to git. It is \f[I]efficient\f[R] (files in the repository can be removed and renamed freely without additional transfers), \f[I]distributed\f[R] (several replicas of the repository can diverge, and in that case a manual merge is supported), \f[I]safe\f[R] (it takes care to prevent data loss and corruption) and \f[I]simple\f[R] (see this manual). .SH QUICK START To create a new repository, enter the directory with its files and type .IP .EX yarsync init .EE .PP This operation is safe and will not affect existing files (including configuration files in an existing repository). Alternatively, run \f[B]init\f[R] inside an empty directory and add files afterward. To complete the initialization, make a commit: .IP .EX yarsync commit \-m \[dq]Initial commit\[dq] .EE .PP \f[B]commit\f[R] creates a snapshot of the working directory, which is all files in the repository except \f[B]yarsync\f[R] configuration and data. This snapshot is very small, because it uses hard links. To check how much your directory size has changed, run \f[B]du\f[R](1). .PP Commit name is the number of seconds since the Epoch (integer Unix time). This allows commits to be ordered in time, even for hosts in different zones. Though this works on most Unix systems and Windows, the epoch is platform dependent. .PP After creating a commit, files can be renamed, deleted or added. 
To see what was changed since the last commit, use \f[B]status\f[R]. To see the history of existing commits, use \f[B]log\f[R]. .PP Hard links are excellent at tracking file moves or renames and storing accidentally removed files. Their downside is that if a file gets corrupt, this will apply to all of its copies in local commits. The 3\-2\-1 backup rule requires to have at least 3 copies of data, so let us add a remote repository \[dq]my_remote\[dq]: .IP .EX yarsync remote add my_remote remote:/path/on/my/remote .EE .PP For local copies we still call the repositories \[dq]remote\[dq], but their paths would be local: .IP .EX yarsync remote add my_drive /mnt/my_drive/my_repo .EE .PP This command only updated our configuration, but did not make any changes at the remote path (which may not exist). To make a copy of our repository, run .IP .EX yarsync clone new\-replica\-name host:/mnt/my_drive/my_repo .EE .PP \f[B]clone\f[R] copies all repository data (except configuration files) to a new replica with the given name and adds the new repository to remotes. .PP To check that we set up the repositories correctly, make a dry run with \[aq]\f[B]\-n\f[R]\[aq]: .IP .EX yarsync push \-n new\-replica\-name .EE .PP If there were no errors and no file transfers, then we have a functioning remote. We can continue working locally, adding and removing files and making commits. When we want to synchronize repositories, we \f[B]push\f[R] the changes \f[I]to\f[R] or \f[B]pull\f[R] them \f[I]from\f[R] a remote (first with a \f[B]\-\-dry\-run\f[R]). This is the recommended workflow, and if we work on different repositories in sequence and always synchronize changes, our life will be easy. Sometimes, however, we may forget to synchronize two replicas and they will end up in a diverged state; we may actually change some files or find them corrupt. Solutions to these problems involve user decisions and are described in \f[B]pull\f[R] and \f[B]push\f[R] options. 
.SH OPTION SUMMARY .PP .TS tab(@); lw(18.7n) lw(51.3n). T{ \-\-help, \-h T}@T{ show help message and exit T} T{ \-\-config\-dir=DIR T}@T{ path to the configuration directory T} T{ \-\-root\-dir=DIR T}@T{ path to the root of the working directory T} T{ \-\-quiet, \-q T}@T{ decrease verbosity T} T{ \-\-verbose, \-v T}@T{ increase verbosity T} T{ \-\-version, \-V T}@T{ print version T} .TE .SH COMMAND SUMMARY .PP .TS tab(@); lw(13.1n) lw(56.9n). T{ T}@T{ T} T{ \f[B]checkout\f[R] T}@T{ restore the working directory to a commit T} T{ \f[B]clone\f[R] T}@T{ clone a repository T} T{ \f[B]commit\f[R] T}@T{ commit the working directory T} T{ \f[B]diff\f[R] T}@T{ print the difference between two commits T} T{ \f[B]init\f[R] T}@T{ initialize a repository T} T{ \f[B]log\f[R] T}@T{ print commit logs T} T{ \f[B]pull\f[R] T}@T{ get data from a source T} T{ \f[B]push\f[R] T}@T{ send data to a destination T} T{ \f[B]remote\f[R] T}@T{ manage remote repositories T} T{ \f[B]show\f[R] T}@T{ print log messages and actual changes for commit(s) T} T{ \f[B]status\f[R] T}@T{ print updates since last commit T} .TE .SH OPTIONS .TP \f[B]\-\-help\f[R], \f[B]\-h\f[R] Prints help message and exits. Default if no arguments are given. After a command name, prints help for that command. .TP \f[B]\-\-config\-dir=DIR\f[R] Provides the path to the configuration directory if it is detached. Both \f[B]\-\-config\-dir\f[R] and \f[B]\-\-root\-dir\f[R] support tilde expansion for user\[cq]s home directory. See SPECIAL REPOSITORIES for usage details. .TP \f[B]\-\-root\-dir=DIR\f[R] Provides the path to the root of the working directory for a detached repository. Requires \f[B]\-\-config\-dir\f[R]. If not set explicitly, the default working directory is the current one. .TP \f[B]\-\-quiet\f[R], \f[B]\-q\f[R] Decreases verbosity. Does not affect error messages (redirect them if needed). .TP \f[B]\-\-verbose\f[R], \f[B]\-v\f[R] Increases verbosity. May print more rsync commands and output. 
Conflicts with \f[B]\-\-quiet\f[R]. .TP \f[B]\-\-version\f[R], \f[B]\-V\f[R] Prints the \f[B]yarsync\f[R] version and exits. If \f[B]\-\-help\f[R] is given, it takes precedence over \f[B]\-\-version\f[R]. .SH COMMANDS All commands support the \f[B]\-\-help\f[R] option. Commands that can change a repository also support the \f[B]\-\-dry\-run\f[R] option. .TP \f[B]\-\-dry\-run\f[R], \f[B]\-n\f[R] Prints what will be transferred during a real run, but does not make any changes. .TP \f[B]\-\-help\f[R], \f[B]\-h\f[R] Prints help for a command or a subcommand. .SH checkout \f[B]yarsync checkout\f[R] [\f[B]\-h\f[R]] [\f[B]\-n\f[R]] \f[I]commit\f[R] .PP Restores the working directory to its state during \f[I]commit\f[R]. WARNING: this will overwrite the working directory. Make sure that all important data is committed. Make a dry run first with \f[B]\-n\f[R]. .PP If a commit other than the most recent one was checked out, the repository HEAD (in git terminology, see \f[B]git\-checkout\f[R](1)) becomes detached, which prevents such operations as \f[B]pull\f[R] or \f[B]push\f[R]. To advance the repository to its correct state, check out the last commit or make a new one. .TP \f[I]commit\f[R] The commit name (as printed in \f[B]log\f[R] or during \f[B]commit\f[R]). .SH clone \f[B]yarsync clone\f[R] [\f[B]\-h\f[R]] \f[I]name\f[R] \f[I]path|parent\-path\f[R] .PP One can clone from within an existing repository \f[B]to\f[R] \f[I]parent\-path\f[R] or clone \f[B]from\f[R] a repository at \f[I]path\f[R]. In both cases a new directory with the repository is created, having the same name as the original repository folder. If that directory already exists, \f[B]clone\f[R] will fail (several safety checks are made). The local repository (origin or clone) will add another one as a remote. .PP Note that only data (working directory, commits, logs and synchronization information, not configuration files) will be cloned. 
This command will refuse to clone \f[B]from\f[R] a repository with a filter (see SPECIAL REPOSITORIES). .PP \f[I]parent\-path\f[R] is useful when we want to clone several repositories into one directory. It allows us to use the same command for each of them (manually or with \f[B]mr\f[R](1)). If one needs to have a different directory name for a repository, they can rename it manually (we don\[cq]t require, but strongly encourage having same directory names for all replicas). .SS Positional arguments .TP \f[I]name\f[R] Name of the new repository. .TP \f[I]path\f[R] Path to the source repository (local or remote). Trailing slash is ignored. .TP \f[I]parent\-path\f[R] Path to the parent directory of the cloned repository (local or remote). Trailing slash is ignored. .SH commit \f[B]yarsync commit\f[R] [\f[B]\-h\f[R]] [\f[B]\-m\f[R] \f[I]message\f[R]] [\f[B]\[en]limit\f[R] \f[I]number\f[R]] .PP Commits the working directory (makes its snapshot). See QUICK START for more details on commits. .TP \f[B]\-\-limit\f[R]=\f[I]number\f[R] Maximum number of commits. If the current number of commits exceeds that, older ones are removed during \f[B]commit\f[R]. See SPECIAL REPOSITORIES for more details. .TP \f[I]message\f[R] Commit message (used in logs). Can be empty. .SH diff \f[B]yarsync diff\f[R] [\f[B]\-h\f[R]] \f[I]commit\f[R] [\f[I]commit\f[R]] .PP Prints the difference between two commits (from old to the new one, the order of arguments is unimportant). If the second commit is omitted, compares \f[I]commit\f[R] to the most recent one. See \f[B]status\f[R] for the output format. .TP \f[I]commit\f[R] Commit name. .SH init \f[B]yarsync init\f[R] [\f[B]\-h\f[R]] [\f[I]reponame\f[R]] .PP Initializes a \f[B]yarsync\f[R] repository in the current directory. Creates a configuration folder with repository files. Existing configuration and files in the working directory stay unchanged. Create a first commit for the repository to become fully operational. 
.TP \f[I]reponame\f[R] Name of the repository. If not provided on the command line, you will be prompted for it. .SH log \f[B]yarsync log\f[R] [\f[B]\-h\f[R]] [\f[B]\-n\f[R] \f[I]number\f[R]] [\f[B]\-r\f[R]] .PP Prints commit logs (from newest to oldest), as well as synchronization information when it is available. To see changes in the working directory, use \f[B]status\f[R]. .SS Options .TP \f[B]\-\-max\-count\f[R]=\f[I]number\f[R], \f[B]\-n\f[R] Maximum number of logs shown. .TP \f[B]\-\-reverse\f[R], \f[B]\-r\f[R] Reverse log order. .SS Example To print information about the three most recent commits, use .IP .EX yarsync log \-n 3 .EE .SH pull \f[B]yarsync pull\f[R] [\f[B]\-h\f[R]] [\f[B]\-f\f[R] | \f[B]\-\-new\f[R] | \f[B]\-b\f[R] | \f[B]\-\-backup\-dir\f[R] \f[I]DIR\f[R]] [\f[B]\-n\f[R]] \f[I]source\f[R] .PP Gets data from a remote \f[I]source\f[R]. The difference between \f[B]pull\f[R] and \f[B]push\f[R] is mostly just the direction of transfer. .PP \f[B]pull\f[R] and \f[B]push\f[R] bring two repositories into the same state. They synchronize the working directory, that is, they add to the destination new files from source, remove those missing on source and do all renames and moves of previously committed files efficiently. This is done in one run, and these changes apply also to logs, commits and synchronization. In most cases, we do not want our existing logs and commits to be removed though. By default, several checks are made to prevent data loss: .IP .EX \- local has no uncommitted changes, \- local does not have a detached HEAD, \- local is not in a merging state, \- destination has no commits missing on source. .EE .PP If any of these checks fails, no modifications will be made. Note that the remote may have uncommitted changes itself: always make a dry run with \f[B]\-n\f[R] first! .PP To commit local changes to the repository, use \f[B]commit\f[R]. The HEAD commit could be changed during \f[B]checkout\f[R] (see its section for the solutions). 
If the destination has commits missing on source, there are two options: to \f[B]\-\-force\f[R] changes to the destination (removing these commits) or to merge changes inside the local repository with \f[B]pull \-\-new\f[R].
.PP
If we pull new commits from the remote, this will bring the repository into a merging state.
The merge will be done automatically if the last remote commit is among the local ones (in that case only some older commits were transferred from there).
If some recent remote commits are not present locally, however, this means that the histories of the repositories have diverged, and we will need to merge them manually.
After we have all local and remote commits and the union of the working directories in our local repository, we can safely choose the easiest way to merge them.
To see the changes, use \f[B]status\f[R] and \f[B]log\f[R].
For example, if the only change was a file added in a \f[I]remote_commit\f[R], we can just \f[B]commit\f[R] the changes.
If we have made many local changes, renames and removals since then, it may be better to \f[B]checkout\f[R] our latest commit (remember that all files from the working directory are present in commits, so it is always safe) and link the new file to the working directory:
.IP
.EX
ln .ys/commits/\f[I]remote_commit\f[R]/path/to/file .
.EE
.PP
(the file can then be moved into a subdirectory without the risk of breaking hard links).
If the remote commit was actually large, and local changes were recent but small, then we shall check out the remote commit and apply local changes by hand.
After our working directory is in the desired state, we \f[B]commit\f[R] the changes and the merge is finished.
The result can then be pushed to the remote without problems.
.SS pull options
.TP
\f[B]\-\-new\f[R]
Do not remove local data that is missing on \f[I]source\f[R].
While this option can return deleted or moved files back to the working directory, it also adds remote logs and commits that were missing here (for example, old or unsynchronized commits).
A forced \f[B]push\f[R] to the remote could remove these logs and commits, and this option allows one to first \f[B]pull\f[R] them to the local repository.
.RS
.PP
After \f[B]pull \-\-new\f[R] the local repository can enter a merging state.
See the \f[B]pull\f[R] description for more details.
.RE
.TP
\f[B]\-\-backup\f[R], \f[B]\-b\f[R]
Changed files in the working directory are renamed (appended with \[aq]\f[B]\[ti]\f[R]\[aq]).
See \f[B]\-\-backup\-dir\f[R] for more details.
.TP
\f[B]\-\-backup\-dir\f[R] \f[I]DIR\f[R]
Changed local files are put into a directory \f[I]DIR\f[R] preserving their relative paths.
\f[I]DIR\f[R] can be an absolute path or relative to the root of the repository.
In contrast to \f[B]\-\-backup\f[R], \f[B]\-\-backup\-dir\f[R] does not change the resulting file names.
.RS
.PP
This option is convenient for large file trees, because it recreates the existing file structure of the repository (one doesn\[cq]t have to search for new backup files in all subdirectories).
For the current rsync version, the command
.IP
.EX
yarsync pull \-\-backup\-dir BACKUP
.EE
.PP
will copy updated files from the remote and put them into the directory \[dq]BACKUP/BACKUP\[dq] (this is how rsync works).
To reduce confusion, make a standard \f[B]pull\f[R] first (so that during the backup there are only file updates).
.PP
This option is available only for \f[B]pull\f[R], because it is assumed that the user will apply local file changes after the backup.
For example, suppose that after a \f[B]pull \-\-backup\f[R] one gets files \f[I]a\f[R] and \f[I]a\[ti]\f[R] in the working directory.
One should first see which version is correct.
If it is the local file \f[I]a\[ti]\f[R], then the backup can be removed:
.IP
.EX
mv a\[ti] a
.EE
.PP
By local we mean the one hard linked with local commits (run \f[I]ls \-i\f[R] to be sure).
If the remote version is correct though, you first need to overwrite the local version without breaking the hard links.
This can be done with an rsync option \[dq]\-\-inplace\[dq]: .IP .EX rsync \-\-inplace a a\[ti] mv a\[ti] a # check file contents and the links ls \-i a .ys/commits/*/a .EE .PP For a \f[B]\-\-backup\-dir\f[R] and for longer paths these commands will be longer. Finally, if you need several versions, just save one of the files under a different name in the repository. .PP After you have fixed all corrupt files, push them back to the remote. .RE .SS pull and push options .TP \f[B]\-\-force\f[R], \f[B]\-f\f[R] Updates the working directory, removing commits and logs missing on source. This command brings two repositories to the nearest possible states: their working directories, commits and logs become the same. While working directories are always identical after \f[B]pull\f[R] or \f[B]push\f[R] (except for some of the \f[B]pull\f[R] options), \f[B]yarsync\f[R] generally refuses to remove existing commits or logs \- unless this option is given. Use it if the destination has really unneeded commits or just remove them manually (see FILES for details on the commit directory). See also \f[B]pull \-\-new\f[R] on how to fetch missing commits. .SH push \f[B]yarsync push\f[R] [\f[B]\-h\f[R]] [\f[B]\-f\f[R]] [\f[B]\-n\f[R]] \f[I]destination\f[R] .PP Sends data to a remote \f[I]destination\f[R]. See \f[B]pull\f[R] for more details and common options. .SH remote \f[B]yarsync remote\f[R] [\f[B]\-h\f[R]] [\f[B]\-v\f[R]] [\f[I]command\f[R]] .PP Manages remote repositories configuration. By default, prints existing remotes. For more options, see \f[I].ys/config.ini\f[R] in the FILES section. .TP \f[B]\-v\f[R] Verbose. Prints remote paths as well. .SS \f[B]add\f[R] \f[B]yarsync remote add\f[R] [\f[B]\-h\f[R]] \f[I]repository\f[R] \f[I]path\f[R] .PP Adds a new remote. \f[I]repository\f[R] is the name of the remote in local \f[B]yarsync\f[R] configuration (as it will be used later during \f[B]pull\f[R] or \f[B]push\f[R]). 
\f[I]path\f[R] has the standard form [user\[at]]host:[path] for an actual remote host, or it can be a local path.
Since \f[B]yarsync\f[R] commands can be called from any subdirectory, a local path should be absolute.
A tilde \[aq]\f[B]\[ti]\f[R]\[aq] for the user\[cq]s home directory is allowed in paths.
.SS rm
\f[B]yarsync remote rm\f[R] [\f[B]\-h\f[R]] \f[I]repository\f[R]
.PP
Removes an existing \f[I]repository\f[R] from the local configuration.
.SS show
Prints remote repositories.
Default.
.SH show
\f[B]yarsync show\f[R] [\f[B]\-h\f[R]] \f[I]commit\f[R] [\f[I]commit\f[R] \&...]
.PP
Prints log messages and actual changes for commit(s).
Changes are shown compared to the commit before \f[I]commit\f[R].
For the output format, see \f[B]status\f[R].
Information for several commits can be requested as well.
.TP
\f[I]commit\f[R]
Commit name.
.SH status
\f[B]yarsync status\f[R] [\f[B]\-h\f[R]]
.PP
Prints working directory updates since the last commit and the repository status.
If there were no errors, this command always returns success (irrespective of uncommitted changes).
.SS Output format of the updates
The output for the updates is a list of changes, including attribute changes, and is based on the format of \f[I]rsync \-\-itemize\-changes\f[R].
For example, a line
.IP
.EX
\&.d..t...... programming/
.EE
.PP
means that the modification time \[aq]\f[I]t\f[R]\[aq] of the directory \[aq]\f[I]d\f[R]\[aq] \f[I]programming/\f[R] in the root of the repository has changed (files were added to or removed from it).
All its other attributes are unchanged (\[aq].\[aq]).
.PP
The output is an 11\-letter string of the format \[dq]YXcstpoguax\[dq], where \[aq]Y\[aq] is the update type, \[aq]X\[aq] is the file type, and the other letters represent attributes that are printed if they were changed.
For a newly created file these would be \[aq]+\[aq], like
.IP
.EX
>f+++++++++ /path/to/file
.EE
.PP
The attribute letters are: \f[B]c\f[R]hecksum, \f[B]s\f[R]ize, modification \f[B]t\f[R]ime, \f[B]p\f[R]ermissions, \f[B]o\f[R]wner and \f[B]g\f[R]roup.
\f[B]u\f[R] can in fact be \f[B]u\f[R]se (access) or creatio\f[B]n\f[R] time, or \f[B]b\f[R]oth.
\f[B]a\f[R] stands for ACL, and \f[B]x\f[R] for extended attributes.
Complete details on the output format can be found in the \f[B]rsync\f[R](1) manual.
.SH SPECIAL REPOSITORIES
A \f[B]detached\f[R] repository is one with the \f[B]yarsync\f[R] configuration directory outside the working directory.
To use such a repository, one must provide the \f[B]yarsync\f[R] options \f[B]\-\-config\-dir\f[R] and \f[B]\-\-root\-dir\f[R] with every command (\f[B]alias\f[R](1p) may be of help).
To create a detached repository, use \f[B]init\f[R] with these options or move the existing configuration directory manually.
For example, if one wants to have several versions of static Web pages, they may create a detached repository and publish the working directory without the Web server having access to the configuration.
Alternatively, if one really wants to have both continuous synchronization and \f[B]yarsync\f[R] backups, they can move its configuration outside, if that works.
Commits in such repositories can be created or checked out, but \f[B]pull\f[R] and \f[B]push\f[R] are currently not supported (one will have to synchronize them manually).
A detached repository is similar to a bare repository in git, but usually has a working directory.
.PP
A repository with a \f[B]filter\f[R] can exclude (disable tracking of) some files or directories from the working directory.
This may be convenient, but makes synchronization less reliable, and such a repository cannot be used as a remote.
See \f[B]rsync\-filter\f[R] in the FILES section for more details.
.PP
A repository can have a \f[B]commit limit\f[R].
The maximum number of commits can be set during \f[B]commit\f[R].
\f[B]pull\f[R] and \f[B]push\f[R] do not check for missing commits on the destination when we are in a repository with a commit limit.
This makes a repository with a commit limit more like a central repository.
If we have reached the maximum number of commits, older ones are deleted during a new \f[B]commit\f[R].
The commit limit is stored in \f[B].ys/COMMIT_LIMIT.txt\f[R].
It can be changed or removed at any time.
The commit limit was introduced in \f[CR]yarsync v0.2\f[R] and was designed to help against the problem of too many hard links (if such a problem exists).
.SH FILES
All \f[B]yarsync\f[R] repository configuration and data are stored in the hidden directory \f[B].ys\f[R] under the root of the working directory.
If the user no longer wants to use \f[B]yarsync\f[R] and the working directory is in the desired state, they can safely remove the \f[B].ys\f[R] directory.
.PP
Apart from the working directory, only commits, logs and synchronization data are synchronized between the repositories.
Each repository has its own configuration and name.
.SH User configuration files
.TP
\f[B].ys/config.ini\f[R]
Contains names and paths of remote repositories.
This file can be edited directly or with \f[B]remote\f[R] commands, according to the user\[cq]s preference.
.RS
.PP
\f[B]yarsync\f[R] supports synchronization only with existing remotes.
A simple configuration for a remote \[dq]my_remote\[dq] could be:
.IP
.EX
[my_remote]
path = remote:/path/on/my/remote
.EE
.PP
Several sections can be added for more remotes.
An example (non\-effective) configuration is created during \f[B]init\f[R].
Note that comments in \f[B]config.ini\f[R] can be erased during \f[B]remote\f[R] {\f[B]add\f[R],\f[B]rm\f[R]}.
.PP
Since removable media or remote hosts can change their paths or IP addresses, one may use variable substitution in paths:
.IP
.EX
[my_drive]
path = $MY_DRIVE/my_repo
.EE
.PP
For the substitutions to take effect, export these variables before running \f[B]yarsync\f[R]:
.IP
.EX
$ export MY_DRIVE=/run/media/my_drive
$ yarsync push \-n my_drive
.EE
.PP
If we made a mistake in the variable or path, it will be shown in the printed command.
Always use \f[B]\-\-dry\-run\f[R] first to ensure proper synchronization.
.PP
Another \f[B]yarsync\f[R] remote configuration option is \f[B]host\f[R].
If both \f[B]path\f[R] and \f[B]host\f[R] are present, the effective path will be their concatenation \[dq]host:path\[dq].
An empty \f[B]host\f[R] means the local host and does not prepend the path.
.PP
It is possible to set a default \f[B]host\f[R] for each section from the section name.
For that, add a default section with an option \f[B]host_from_section_name\f[R]:
.IP
.EX
[DEFAULT]
host_from_section_name
.EE
.PP
Empty lines and lines starting with \[aq]\f[B]#\f[R]\[aq] are ignored.
Section names are case\-sensitive.
Whitespace in a section name is considered part of the name.
Spaces around \[aq]\f[B]=\f[R]\[aq] are allowed.
The full syntax specification can be found at \c
.UR https://docs.python.org/3/library/configparser.html
.UE \c
\&.
.RE
.TP
\f[B].ys/repo_\f[R]\f[I]reponame\f[R]\f[B].txt\f[R]
Contains the repository name, which is used in logs and usually should coincide with the remote name (how the local repository is called on remotes).
The name can be set during \f[B]init\f[R] or edited manually.
.RS
.PP
Each repository replica must have a unique name.
For example, if one has repositories \[dq]programming/\[dq] and \[dq]music/\[dq] on a laptop \[dq]my_host\[dq], their names would probably be \[dq]my_host\[dq], and the names of their copies on an external drive could be \[dq]my_drive\[dq] (this is different from git, which uses only the author\[cq]s name in logs).
.PP
Note that, for technical reasons, \f[B]clone\f[R] from inside a repository creates a temporary file with the new repository name (which is also written in \f[B]CLONE_TO_.txt\f[R]).
If these files remain on the system due to some error, they can be safely removed.
.RE
.TP
\f[B].ys/rsync\-filter\f[R]
Contains rsync filter rules, which effectively define what data belongs to the repository.
The \f[B]rsync\-filter\f[R] does not exist by default, but can be added for flexibility.
.RS
.PP
For example, the author has a repository \[dq]\[ti]/work\[dq], but wants to keep his presentations in \[dq]tex/\[dq] in a separate repository.
Instead of having a different directory \[dq]\[ti]/work_tex\[dq], he adds such rules to \f[B]rsync\-filter\f[R]:
.IP
.EX
# all are in git repositories
\- /repos
# take care to sync separately
\- /tex
.EE
.PP
In this way, \[dq]\[ti]/work/tex\[dq] and the contained git repositories will be excluded from \[dq]\[ti]/work\[dq] synchronization.
Lines starting with \[aq]\f[B]#\f[R]\[aq] are ignored, as well as empty lines.
To complicate things, one could include a subdirectory of \[dq]tex\[dq] into \[dq]work\[dq] with an include filter \[aq]\f[B]+\f[R]\[aq].
For complete details, see the FILTER RULES section of \f[B]rsync\f[R](1).
.PP
While convenient for everyday use, filters make backups more difficult.
To synchronize a repository with them, one has to remember that it has subdirectories that need to be synchronized too.
If the remote repository had its own filters, that would make synchronization even more unreliable.
Therefore filters are generally discouraged: \f[B]pull\f[R] and \f[B]push\f[R] ignore remote filters (make sure you synchronize only \f[I]from\f[R] a repository with filters), while \f[B]clone\f[R] refuses to copy from a repository with \f[B]rsync\-filter\f[R].
.RE
.SH yarsync technical directories
.TP
\f[B].ys/commits/\f[R]
Contains local commits (snapshots of the working directory).
If some of the old commits are no longer needed (there are too many of them or they contain a large file), they can be removed.
Make sure, however, that all remote repositories contain at least some of the present commits, otherwise future synchronization will get complicated.
Alternatively, remove unneeded files or folders manually: commits can be edited, with care taken to synchronize them correctly.
.TP
\f[B].ys/logs/\f[R]
Contains text logs produced during \f[B]commit\f[R].
They are not necessary, so removing any of them will not break the repository.
If one wants to fix or improve a commit message though, they may edit the corresponding log (the change will be propagated during \f[B]push\f[R]).
It is recommended to store logs even for old deleted commits, which may be present on formerly used devices.
.TP
\f[B].ys/sync/\f[R]
Contains synchronization information for all known repositories.
This information is transferred between replicas during \f[CR]pull\f[R], \f[CR]push\f[R] and \f[CR]clone\f[R], and it allows \f[CR]yarsync\f[R] repositories to better support the 3\-2\-1 backup rule.
The information is contained in empty files with names of the format \f[B]commit_repo.txt\f[R].
Pulling (or cloning) from a repository does not affect its files and does not update its synchronization information.
\f[B]push\f[R] (and the corresponding \f[B]clone\f[R]) updates synchronization for both replicas.
For each repository only the most recent commit is stored.
The \f[B]sync\f[R] directory was introduced in \f[CR]yarsync v0.2\f[R].
See the release notes on how to convert old repositories to the new format, or do it manually if necessary.
.RS
.PP
If a replica has been permanently removed, its synchronization data must be removed manually and propagated with \f[B]\-\-force\f[R].
.RE
.SH EXIT STATUS
.TP
\f[B]0\f[R]
Success
.TP
\f[B]1\f[R]
Invalid option
.TP
\f[B]7\f[R]
Configuration error
.TP
\f[B]8\f[R]
Command error
.TP
\f[B]9\f[R]
System error
.TP
\f[B]2\-6\f[R],\f[B]10\-14\f[R],\f[B]20\-25\f[R],\f[B]30\f[R],\f[B]35\f[R]
rsync error
.PP
If the command could be run successfully, a zero code is returned.
The invalid option code is returned for mistakes in command line argument syntax.
A configuration error can occur when we are outside an existing repository or a \f[B]yarsync\f[R] configuration file is missing.
If the repository is correct, but the command is not allowed in its current state (for example, one can not push or pull when there are uncommitted changes or add a remote with an already present name), the command error is returned.
It is also possible that a general system error, such as a keyboard interrupt, is raised in the Python interpreter.
See \f[B]rsync\f[R](1) for rsync errors.
.SH DIAGNOSTICS
To check that your clocks (used for properly ordering commits) at different hosts are synchronized well enough, run
.IP
.EX
python \-c \[aq]import time; print(time.time())\[aq]
.EE
.PP
To make sure that the local repository supports hard links instead of creating file copies, test it with
.IP
.EX
du \-sh .
du \-sh .ys
.EE
.PP
(can be run during \f[B]pull\f[R] or \f[B]clone\f[R] if they take too long).
The results must be almost the same.
If they differ considerably, hard links may be unsupported on this file system (in which case you should not use \f[B]yarsync\f[R] there), old commits may store large deleted files, or subdirectories may be excluded with a \f[B]filter\f[R] (see the SPECIAL REPOSITORIES section).
.PP
To test that a particular file \[dq]a\[dq] was hard linked to its committed versions, run
.IP
.EX
ls \-i a .ys/commits/*/a
.EE
.PP
If all is correct, their inodes must be the same.
.PP
Hard links can be broken in a cloned git repository (as it could happen with \f[B]yarsync\f[R] tests before), because git does not preserve them.
To fix hard links for the whole repository, run \f[B]hardlink\f[R](1) in its root. .SH SEE ALSO \f[B]rsync\f[R](1) .PP The yarsync page is \c .UR https://github.com/ynikitenko/yarsync .UE \c \&. .SH BUGS Requires a filesystem with hard links, rsync version at least 3.1.0 (released 28 September 2013) and Python >= 3.6. .PP Always do a \f[B]\-\-dry\-run\f[R] before actual changes. Occasionally Python errors are raised instead of correct return codes. Please report any bugs or make feature requests to \c .UR https://github.com/ynikitenko/yarsync/issues .UE \c \&. .SH COPYRIGHT Copyright © 2021\-2025 Yaroslav Nikitenko. License GPLv3: GNU GPL version 3 \c .UR https://gnu.org/licenses/gpl.html .UE \c \&. .PD 0 .P .PD This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. .SH AUTHORS Written by Yaroslav Nikitenko. yarsync-0.3.1/pyproject.toml000066400000000000000000000061131477154272500161360ustar00rootroot00000000000000# ----- # # Build # # ----- # [build-system] requires = ["setuptools>=61.0.0"] # https://setuptools.pypa.io/en/latest/history.html#v61-0-0 # Since Python 3.6 is stuck with setuptools 59.6.0, it works only for python>=3.7 build-backend = "setuptools.build_meta" [tool.setuptools.packages.find] exclude = ["tests", "tests.*"] # This is the entry-point # https://setuptools.pypa.io/en/latest/userguide/entry_point.html [project.scripts] yarsync = "yarsync.yarsync:main" # -------- # # Metadata # # -------- # [project] name = "yarsync" authors=[{name="Yaroslav Nikitenko",email="metst13@gmail.com"}] description = "Yet Another Rsync is a file synchronization and backup tool" keywords=["distributed", "file", "synchronization", "rsync", "backup"] readme = "README.rst" requires-python = ">=3.6" license = {text = 'GPLv3'} # these are most recent formats for license fields, # but they are not supported by many distributions. Try again in 5 years. 
# license = "GPL-3.0" # license-files = ["LICENSE"] classifiers=[ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: End Users/Desktop", "Intended Audience :: System Administrators", "Operating System :: POSIX :: Linux", # remove in the future "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python :: 3 :: Only", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: System :: Archiving", "Topic :: System :: Archiving :: Backup", "Topic :: System :: Archiving :: Mirroring", "Topic :: Utilities", ] dynamic = ["version"] dependencies = [] [project.urls] Documentation = 'https://yarsync.readthedocs.io' Source = 'https://github.com/ynikitenko/yarsync' Tracker = 'https://github.com/ynikitenko/yarsync/issues' # https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata [tool.setuptools.dynamic] version = {attr = "yarsync.version.__version__"} # -------------------------- # # Development / dependencies # # -------------------------- # [project.optional-dependencies] documentation = [ "furo", "myst-parser", "sphinx", ] [dependency-groups] dev = [ "pip", "coverage", "pytest", "pytest-cov", "pytest-mock", "tox", ] # ----------------- # # Tox configuration # # ----------------- # [tool.tox] requires = ["tox>=3.28.0"] # Python 3.6 was tested separately. 
env_list = ["3.7","3.8", "3.9", "3.10", "3.11", "3.12", "3.13", "pypy3"] [tool.tox.env_run_base] description = "Run test under {base_python}" commands = [["pytest", "{posargs}"]] # Activate the following line in case you wish to use tox with uv #runner = "uv-venv-lock-runner" yarsync-0.3.1/requirements.txt000066400000000000000000000000721477154272500165040ustar00rootroot00000000000000## development pytest coverage pytest-cov pytest-mock tox yarsync-0.3.1/setup.py000066400000000000000000000050011477154272500147270ustar00rootroot00000000000000# setup.py is outdated and shall be removed in the next release. import setuptools with open("README.rst", "r") as readme: long_description = readme.read() # from https://packaging.python.org/en/latest/guides/single-sourcing-package-version/ version = {} with open("yarsync/version.py") as fp: exec(fp.read(), version) setuptools.setup( name="yarsync", version=version['__version__'], author="Yaroslav Nikitenko", author_email="metst13@gmail.com", description="Yet Another Rsync is a file synchronization and backup tool", license="GPLv3", long_description=long_description, long_description_content_type="text/x-rst", url="https://github.com/ynikitenko/yarsync", project_urls = { 'Documentation': 'https://yarsync.readthedocs.io', 'Source': 'https://github.com/ynikitenko/yarsync', 'Tracker': 'https://github.com/ynikitenko/yarsync/issues', }, keywords="distributed, file, synchronization, rsync, backup", packages=setuptools.find_packages(exclude=['tests', 'tests.*']), classifiers=[ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: End Users/Desktop", "Intended Audience :: System Administrators", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3 :: Only", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: 
Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: System :: Archiving", "Topic :: System :: Archiving :: Backup", "Topic :: System :: Archiving :: Mirroring", "Topic :: Utilities", ], # briefly about entry points in Russian # https://npm.mipt.ru/youtrack/articles/GENERAL-A-87/Использование-setuptools-в-Python # original docs (also brief): # https://setuptools.pypa.io/en/latest/userguide/entry_point.html entry_points={ 'console_scripts': [ 'yarsync = yarsync.yarsync:main', ] }, python_requires='>=3.6', ) yarsync-0.3.1/tests/000077500000000000000000000000001477154272500143635ustar00rootroot00000000000000yarsync-0.3.1/tests/__init__.py000066400000000000000000000000001477154272500164620ustar00rootroot00000000000000yarsync-0.3.1/tests/conftest.py000066400000000000000000000075611477154272500165730ustar00rootroot00000000000000import os import pathlib import pytest from yarsync import YARsync from .helpers import clone_repo from .settings import TEST_DIR, TEST_DIR_READ_ONLY, TEST_DIR_YS_BAD_PERMISSIONS collect_ignore_glob = ["test_dir_*"] # Tips: # - all common teardowns in one place (here). Single local teardowns are possible. # - tear down before an actual test (save its results and test after having torn down). # - distinguish global (repo) use, common use (where you can get errors) and separate use. # - be prepared for unexpected. Rsync created a directory without permissions, and then failed. def fix_hardlinks(main_dir, dest_dirs): for fil in os.listdir(main_dir): # be careful that dest_dirs are a list, not an exhaustable iterator for dest_dir in dest_dirs: dest_path = dest_dir / fil if dest_path.exists(): # it is important that files were never renamed, # only unlinked (in the general sense). 
# Note that if there were two old commits with one file # (now deleted from the workdir), these won't be linked. if dest_path.is_file(): # not is_dir() dest_path.unlink() # print("link ", main_dir / fil, dest_path) os.link(main_dir / fil, dest_path) # there is also Path.hardlink_to, # but only available since version 3.10. else: fix_hardlinks(main_dir / fil, [dest_path]) def fix_ys_hardlinks(test_dir): # since we clone only TEST_DIR, it would be enough # to fix hard links there (they can get messed up by git). test_dir = pathlib.Path(test_dir) commit_dir = test_dir / ".ys" / "commits" fix_hardlinks(test_dir, list(commit_dir.iterdir())) @pytest.fixture(scope="session", autouse=True) def fix_test_dir(): fix_ys_hardlinks(TEST_DIR) @pytest.fixture(scope="session", autouse=True) def fix_test_dir_bad_permissions(): subdir_bad_perms = os.path.join(TEST_DIR_YS_BAD_PERMISSIONS, "forbidden") os.chmod(subdir_bad_perms, 0o000) # we tear down later, because otherwise pytest will have problems # with searching in that directory yield TEST_DIR_YS_BAD_PERMISSIONS os.chmod(subdir_bad_perms, 0o755) @pytest.fixture def test_dir(): os.chdir(TEST_DIR) @pytest.fixture(scope="session") def test_dir_common_copy(tmp_path_factory): # can't use the tmp_path fixture, because it has a function scope tmp_path = str(tmp_path_factory.mktemp("test_dir_copy")) clone_repo(TEST_DIR, tmp_path) os.chdir(tmp_path) return tmp_path # usage: # @pytest.mark.usefixtures("test_dir_separate_copy") @pytest.fixture def test_dir_separate_copy(tmp_path): clone_repo(TEST_DIR, str(tmp_path)) os.chdir(tmp_path) return tmp_path @pytest.fixture() def test_dir_ys_bad_permissions(): os.chdir(TEST_DIR_YS_BAD_PERMISSIONS) @pytest.fixture(scope="session") def origin_test_dir(origin_dir=TEST_DIR, test_dir=TEST_DIR_YS_BAD_PERMISSIONS): """Add a remote "origin" with a path to origin_dir to test_dir.""" os.chdir(test_dir) # disable stdout, or it will interfere with other tests. 
# There are open problems with capfd in pytest: # https://github.com/pytest-dev/pytest/issues/4428 ys_add = YARsync("yarsync -qq remote add origin ".split() + [origin_dir]) # this will fail (return 7) if the remote is already there, # but it doesn't affect the results. ys_add() yield origin_dir # remove remote "origin" # Enter test_dir again, because the current directory could change. os.chdir(test_dir) ys_rm = YARsync("yarsync remote rm origin".split()) assert not ys_rm() @pytest.fixture(scope="session") def test_dir_read_only(): os.chmod(TEST_DIR_READ_ONLY, 0o544) yield TEST_DIR_READ_ONLY # tear down os.chmod(TEST_DIR_READ_ONLY, 0o755) yarsync-0.3.1/tests/helpers.py000066400000000000000000000011731477154272500164010ustar00rootroot00000000000000import os import subprocess def clone_repo(from_path, to_path): if not from_path.endswith(os.path.sep): from_path += os.path.sep rsync_command = ( "rsync -avHP --delete-after " # --filter='merge .ys/rsync-filter' " # we also copy configuration files, # because we need a real repository # with config.ini, etc. 
# "--include=/.ys/commits --include=/.ys/logs --exclude=/.ys/* " + from_path + " " + to_path ) print(rsync_command) subprocess.check_call(rsync_command, shell=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) yarsync-0.3.1/tests/settings.py000066400000000000000000000020511477154272500165730ustar00rootroot00000000000000import os # directory with commits and logs TEST_DIR = os.path.join(os.path.dirname(__file__), "test_dir") # same as TEST_DIR, but with an rsync-filter TEST_DIR_FILTER = os.path.join(os.path.dirname(__file__), "test_dir_filter") # directory without commits and logs, but with a .ys configuration TEST_DIR_EMPTY = os.path.join(os.path.dirname(__file__), "test_dir_empty") # directory with no files and no .ys directory TEST_DIR_READ_ONLY = os.path.join(os.path.dirname(__file__), "test_dir_read_only") # directory with a .ys repository, but with a forbidden subdirectory TEST_DIR_YS_BAD_PERMISSIONS = os.path.join(os.path.dirname(__file__), "test_dir_ys_bad_permissions") # content must be same as in TEST_DIR, but the YSDIR is detached. 
TEST_DIR_CONFIG_DIR = os.path.join(os.path.dirname(__file__), "test_dir_config_dir") TEST_DIR_WORK_DIR = os.path.join(os.path.dirname(__file__), "test_dir_work_dir") YSDIR = ".ys" yarsync-0.3.1/tests/test_clone.py000066400000000000000000000232631477154272500171020ustar00rootroot00000000000000import os import pytest from yarsync import YARsync from yarsync.yarsync import ( CONFIG_ERROR, COMMAND_ERROR ) from .helpers import clone_repo from .settings import TEST_DIR, TEST_DIR_FILTER, TEST_DIR_YS_BAD_PERMISSIONS clone_command = ["yarsync", "clone"] rsync_error = 23 def test_clone_from(tmp_path_factory, capfd): dest = tmp_path_factory.mktemp("test") os.chdir(str(dest)) # when we clone from, we don't affect the original repo ys = YARsync(clone_command + ["tmp", TEST_DIR]) returncode = ys() assert not returncode test_dir_name = 'test_dir' # os.path.basename(TEST_DIR) # test_dir was cloned into dest assert os.listdir(str(dest)) == [test_dir_name] # files from test_dir were transferred new_repo_dir = os.path.join(dest, test_dir_name) assert set(os.listdir(new_repo_dir)) == set(['b', 'a', 'c', '.ys']) # configuration files were transferred # clone name was used correctly new_ys_dir = os.path.join(new_repo_dir, ys.YSDIR) assert set(os.listdir(new_ys_dir)) == set([ 'sync', 'commits', 'repo_tmp.txt', 'logs', 'config.ini' ]) # we don't check every commit, because that is done in pull new_sync_dir = os.path.join(new_ys_dir, ys.SYNCDIRNAME) assert set(os.listdir(new_sync_dir)) == set([ '2_TEST.txt', '2_tmp.txt', '2_other_repo.txt' ]) # no errors were issued captured = capfd.readouterr() assert not captured.err ## Can't clone from a directory with filter dest3 = tmp_path_factory.mktemp("dest") os.chdir(dest3) ys3 = YARsync(["yarsync", "clone", "clone", TEST_DIR_FILTER]) return_code = ys3() assert return_code == CONFIG_ERROR @pytest.mark.parametrize( "test_dir_source", (TEST_DIR, TEST_DIR_FILTER) ) def test_clone_to(tmp_path_factory, test_dir_source): # todo: filter is probably 
irrelevant for this test # and doesn't work (where do we check for it?) source_dir = tmp_path_factory.mktemp("test") clone_dir = tmp_path_factory.mktemp("test") repo_name = os.path.basename(source_dir) # we copy the repository low-level to get rid of side effects # (synchronization, new remote, etc.) clone_repo(str(test_dir_source), str(source_dir)) os.chdir(str(source_dir)) ys = YARsync(clone_command + ["tmp", str(clone_dir)]) returncode = ys() assert not returncode # todo: capture output # if "-v" in clone_command: ... # all files were transferred new_repo = os.path.join(clone_dir, repo_name) # we compare sets, because the ordering will be different new_files = set(os.listdir(new_repo)) assert new_files.issubset(set(os.listdir(source_dir))) # they are not equal, because rsync-filter excludes 'b' assert 'a' in new_files def test_clone_from_errors_1(tmp_path_factory, capfd): invalid_source = tmp_path_factory.mktemp("invalid_source") dest1 = tmp_path_factory.mktemp("dest_for_cloning") ## Can't clone from an invalid repository os.chdir(dest1) ys1 = YARsync(clone_command + ["origin", str(invalid_source)]) # will be called clone_from, because we are outside. return_code = ys1._clone_from("origin", str(invalid_source)) assert return_code == COMMAND_ERROR err = capfd.readouterr().err assert 'No yarsync repository found at ' in err ## Can't clone into a repository with the same name os.chdir(dest1) ys_same_name = YARsync(clone_command + ["TEST", TEST_DIR]) returncode = ys_same_name() assert returncode == COMMAND_ERROR assert "Name 'TEST' is already used by the remote. " in capfd.readouterr().err ## Can't clone if the new directory already exists os.mkdir("test_dir") ys_exists = YARsync(clone_command + ["new_TEST", TEST_DIR]) returncode = ys_exists() assert returncode == COMMAND_ERROR assert "Directory 'test_dir' exists. 
Aborting" in capfd.readouterr().err def test_clone_from_errors_2(tmp_path_factory, capfd): ## Can't clone from a repository with bad permissions dest3 = str(tmp_path_factory.mktemp("dest_to_fail")) os.chdir(dest3) # slash added for variability (coverage) ys3 = YARsync(clone_command + ["clone", TEST_DIR_YS_BAD_PERMISSIONS + '/']) assert ys3() == rsync_error captured = capfd.readouterr() assert "An error occurred while pulling data from" in captured.err # even though the contents of the directory were not transferred, # it has been created. forbidden = os.path.join(dest3, "test_dir_ys_bad_permissions", "forbidden") os.chmod(forbidden, 0o777) @pytest.mark.usefixtures("test_dir_ys_bad_permissions") def test_clone_from_repo_with_bad_permissions(test_dir_common_copy, capfd): ## Can't clone from inside a repository with bad permissions ys = YARsync(clone_command + ["clone", test_dir_common_copy]) # this is a COMMAND_ERROR, because there are uncommitted changes assert ys._clone_to("clone", test_dir_common_copy) == COMMAND_ERROR captured = capfd.readouterr() assert "local repository has uncommitted changes. Exit." in captured.err @pytest.mark.usefixtures("test_dir_common_copy") def test_clone_to_errors_1(tmp_path_factory, capfd): ## Can't clone from here if we can't read the remote parent # create a directory with bad permissions dest_bad = str(tmp_path_factory.mktemp("bad_dest")) # from https://stackoverflow.com/a/25988623/952234 # cur_perm = stat.S_IMODE(os.lstat(dest_bad).st_mode) # os.chmod(dest_bad, cur_perm & ~stat.S_IREAD) os.chmod(dest_bad, 0o000) # do the clone ys_bad_parent = YARsync(clone_command + ["bad_parent", dest_bad]) returncode = ys_bad_parent() # tear down permissions os.chmod(dest_bad, 0o755) # os.chmod(dest_bad, cur_perm | stat.S_IREAD) assert returncode == COMMAND_ERROR captured = capfd.readouterr() assert "Parent folder of the clone could not be read. 
Aborting"\ in captured.err @pytest.mark.usefixtures("test_dir_separate_copy") def test_clone_to_errors_2(tmp_path_factory, capfd): ## Clone name must not exist in remotes dest4 = str(tmp_path_factory.mktemp("dest_for_errors")) ys4 = YARsync(clone_command + ["other_repo", dest4]) assert ys4() == COMMAND_ERROR assert "remote other_repo exists, break." in capfd.readouterr().err ## synchronization changes are updated normally sync_dir6 = os.path.join(".ys", "sync") sync60 = os.listdir(sync_dir6) assert set(sync60) == set(['2_other_repo.txt', '2_TEST.txt']) dest6 = str(tmp_path_factory.mktemp("new_dest")) ys6 = YARsync(clone_command + ["new_clone_2", dest6]) assert ys6() == 0 sync61 = os.listdir(sync_dir6) assert set(sync61) == set(['2_new_clone_2.txt', '2_other_repo.txt', '2_TEST.txt']) # manually fix it, otherwise pytest complains about garbage, # see https://github.com/pytest-dev/pytest/issues/7821 def test_clone_to_errors_3(tmp_path_factory, test_dir_separate_copy, capfd): # if we don't mark it, we have to cd explicitly os.chdir(test_dir_separate_copy) # print(os.getcwd()) ## synchronization changes are reverted in case of errors sync_dir5 = os.path.join(".ys", "sync") sync50 = set(os.listdir(sync_dir5)) assert sync50 == {'2_other_repo.txt', '2_TEST.txt'} dest5 = str(tmp_path_factory.mktemp("dest")) ys5 = YARsync(clone_command + ["new", dest5]) comm_1_dir = os.path.join(".ys", "commits", "1") repo_dir_name = os.path.basename(test_dir_separate_copy) comm_1_dir_copy = os.path.join(dest5, repo_dir_name, ".ys", "commits", "1") os.chmod(comm_1_dir, 0o000) returncode = ys5() os.chmod(comm_1_dir, 0o755) os.chmod(comm_1_dir_copy, 0o755) # command error, because the problem is with local repo # rsync error, because otherwise test fails assert returncode == rsync_error sync51 = set(os.listdir(sync_dir5)) assert sync51 == sync50 def test_clone_to_errors_parent(tmp_path_factory, capfd): ## Can't clone if the remote contains ## a folder with this directory name test_dir2 = 
tmp_path_factory.mktemp("test_dir2") clone_repo(str(TEST_DIR), str(test_dir2)) os.chdir(test_dir2) dest_exists = str(tmp_path_factory.mktemp("new_dest")) # create a directory with the name of local repository in dest os.mkdir(os.path.join(dest_exists, os.path.basename(test_dir2))) ys_dest_ex = YARsync(clone_command + ["new", dest_exists]) assert ys_dest_ex() == COMMAND_ERROR captured = capfd.readouterr() assert "Repository folder already exists at " + dest_exists in captured.err def test_clone_with_env_path(tmp_path_factory): ### Clone from path with an environmental variable ### preserves that variable in the configuration test_dir = tmp_path_factory.mktemp("new_test_dir") test_dir_name = os.path.basename(test_dir) clone_repo(str(TEST_DIR), str(test_dir)) ## Clone from path with an envvar works dest1 = tmp_path_factory.mktemp("dest") os.chdir(dest1) os.environ["TEST_DIR"] = str(test_dir) ys1 = YARsync(clone_command + ["clone", "$TEST_DIR"]) return_code = ys1() assert not return_code # remove the variable, so that it is no longer expanded in _config del os.environ["TEST_DIR"] ys11 = YARsync("yarsync remote show".split()) ys11() # it is TEST, because it is the name of origin. 
assert ys11._config["TEST"]["path"] == "$TEST_DIR" ## Clone to path with an envvar works dest2 = tmp_path_factory.mktemp("dest2") os.environ["DEST2"] = str(dest2) # we are still in dest1 ys2 = YARsync(["yarsync", "clone", "clone2", "$DEST2"]) return_code = ys2() assert not return_code del os.environ["DEST2"] ys21 = YARsync("yarsync remote show".split()) ys21() assert ys21._config["clone2"]["path"] == \ os.path.join("$DEST2", test_dir_name) yarsync-0.3.1/tests/test_commit.py000066400000000000000000000160051477154272500172660ustar00rootroot00000000000000import os import pytest import time from sys import version_info from yarsync import YARsync from .settings import TEST_DIR_EMPTY, YSDIR def test_commit(mocker): """Test commit creation and logging.""" os.chdir(TEST_DIR_EMPTY) # important that it goes before patches, need normal initialization commit_msg = "initial commit" args = ["yarsync", "commit", "-m", commit_msg] # we initialize ys here, but don't create a commit. ys = YARsync(args) # time.localtime uses time.time time_3 = time.localtime(3) def loctime(sec=None): return time_3 mocker.patch("time.localtime", loctime) # hope this will work in another time zone. mocker.patch("time.tzname", "MSK") # time.time is called a slight instant after time.localtime # their order is not important though. 
    commit_time = 2
    mocker.patch("time.time", lambda: commit_time)
    rename = mocker.patch("os.rename")
    mkdir = mocker.patch("os.mkdir")
    mocker.patch("socket.gethostname", lambda: "host")
    mocker.patch("getpass.getuser", lambda: "user")
    m = mocker.mock_open()

    # adapted from https://stackoverflow.com/a/38618056/952234
    def my_open(filename, mode="r"):
        if mode != "r":
            return m.return_value
        if filename == ys.REPOFILE:
            content = "myhost"
        else:
            raise FileNotFoundError(filename)
        file_object = mocker.mock_open(read_data=content).return_value
        file_object.__iter__.return_value = content.splitlines(True)
        return file_object

    mocker.patch("builtins.open", new=my_open)

    popen = mocker.patch("subprocess.Popen")
    subprocess_mock = mocker.Mock()
    attrs = {'communicate.return_value': ('output', 'error')}
    subprocess_mock.configure_mock(**attrs)
    subprocess_mock.configure_mock(**{"returncode": 0})
    popen.return_value = subprocess_mock

    commit_name = str(int(commit_time))
    commit_dir = os.path.join(ys.COMMITDIR, commit_name)
    commit_dir_tmp = commit_dir + "_tmp"
    commit_log_path = os.path.join(ys.LOGDIR, commit_name + ".txt")
    commit_time_str = time.strftime(ys.DATEFMT, time.localtime())

    # call _commit
    res = ys()

    filter_ = ys._get_filter(include_commits=False)
    call = mocker.call
    assert res == 0
    assert mkdir.mock_calls == [
        call(ys.COMMITDIR),
        call(ys.LOGDIR),
    ]
    assert rename.mock_calls == [
        call(commit_dir_tmp, commit_dir),
    ]
    assert popen.mock_calls == [
        call(["rsync", "-a", "--link-dest=../../..", "--exclude=/.ys"]
             + filter_
             + [ys.root_dir + '/', os.path.join(ys.COMMITDIR, "2_tmp")],
             stdout=-3),
        call().communicate(),
    ]
    # this seems patched, but the date on Python 3.6 is still different
    assert time.tzname == "MSK"
    # if sys.version_info.minor <= 6:
    #     # will be UTC
    #     time_str = time.strftime(ys.DATEFMT, time_3)
    # else:
    #     time_str = "Thu, 01 Jan 1970 03:00:03 MSK"
    time_str = time.strftime(ys.DATEFMT, time.localtime(3))

    if version_info.minor >= 13:
        assert m.mock_calls == [
            # call(ys.REPOFILE),
            # call().__enter__(),
            # call().read(),
            # call().__exit__(None, None, None),
            # call(commit_log_path, "w"),
            call().__enter__(),
            call().write(commit_msg + "\n\n"
                         "When: {}\n".format(time_str) +
                         "Where: user@myhost"),
            call().write('\n'),
            call().__exit__(None, None, None),
            call().close(),
        ]
    else:
        assert m.mock_calls == [
            # call(ys.REPOFILE),
            # call().__enter__(),
            # call().read(),
            # call().__exit__(None, None, None),
            # call(commit_log_path, "w"),
            call().__enter__(),
            call().write(commit_msg + "\n\n"
                         "When: {}\n".format(time_str) +
                         "Where: user@myhost"),
            call().write('\n'),
            call().__exit__(None, None, None),
        ]


@pytest.mark.parametrize("commit_time", [4, 0])
@pytest.mark.usefixtures("test_dir_separate_copy")
def test_commit_with_limits(tmp_path, mocker, commit_time):
    """Test commit creation and logging."""
    ys = YARsync(["yarsync", "commit", "--limit", "1"])
    # initially there are two commits cloned
    assert set(os.listdir(ys.COMMITDIR)) == set(("1", "2"))

    # copied from test_commit
    # time.localtime uses time.time
    time_new = time.localtime(commit_time)

    def loctime(sec=None):
        return time_new

    # no idea why need two time patches (but see above about Python 3.6)
    mocker.patch("time.localtime", loctime)
    mocker.patch("time.tzname", "MSK")
    # commit_time = 2
    # this patch is needed for Python 3.9
    mocker.patch("time.time", lambda: commit_time)

    commit_name = str(int(commit_time))
    commit_dir = os.path.join(ys.COMMITDIR, commit_name)

    # call _commit with limit
    res = ys()
    assert res == 0
    # older commits were removed
    if commit_time > 2:
        assert os.listdir(ys.COMMITDIR) == [str(commit_time)]
    if commit_time < 1:
        assert os.listdir(ys.COMMITDIR) == ["2"]


def test_commit_rsync_error(mocker):
    os.chdir(TEST_DIR_EMPTY)
    popen = mocker.patch("subprocess.Popen")
    subprocess_mock = mocker.Mock()
    attrs = {'communicate.return_value': ('output', 'error')}
    subprocess_mock.configure_mock(**attrs)
    # some error occurred in rsync
    subprocess_mock.configure_mock(returncode=1)
    popen.return_value = subprocess_mock
    # just in case. Now it won't be called, but to make code more stable.
    mocker.patch("os.mkdir")

    args = ["yarsync", "commit"]
    ys = YARsync(args)
    res = ys()
    assert res == 1


def test_existing_commit_exception(mocker):
    os.chdir(TEST_DIR_EMPTY)
    mocker.patch("time.time", lambda: 2)

    def _os_path_exists(filepath):
        if YSDIR in filepath and not filepath.endswith("MERGE.txt"):
            return True
        return False

    mocker.patch("os.path.exists", _os_path_exists)

    args = "yarsync commit".split()
    ys = YARsync(args)
    with pytest.raises(RuntimeError) as err:
        res = ys()
    # can't compare them directly
    assert repr(err.value) == repr(RuntimeError(
        "commit {} exists".format(os.path.join(ys.COMMITDIR, "2"))
    ))


def test_existing_tmp_commit_exception(mocker):
    os.chdir(TEST_DIR_EMPTY)
    mocker.patch("time.time", lambda: 2)

    def _os_path_exists(filepath):
        if YSDIR in filepath:
            # print("path = ", filepath)
            return "_tmp" in filepath
        return False

    # initialization is fine, because the config file is present
    args = "yarsync commit".split()
    ys = YARsync(args)
    mocker.patch("os.path.exists", _os_path_exists)
    mocker.patch("os.mkdir")
    with pytest.raises(RuntimeError) as err:
        res = ys()
    assert repr(err.value) == repr(RuntimeError(
        "temporary commit {} exists".format(
            os.path.join(ys.COMMITDIR, "2_tmp"))
    ))


# ===== yarsync-0.3.1/tests/test_dir (test fixture tree) =====
# .ys/commits/1/a: "a"
# .ys/commits/2/a: "a"
# .ys/commits/2/b: "b"
# .ys/commits/2/c/c: "c"
# .ys/commits/2/c/d: "d"
# .ys/config.ini: "[other_repo]\npath = /path/to/other/repo"
# .ys/logs/2.txt: "When: Thu, 01 Jan 1970 03:00:02 MSK\nWhere: user@host"
# .ys/logs/3.txt: "log 3"
# .ys/repo_TEST.txt: empty
# .ys/sync/2_TEST.txt: empty
# .ys/sync/2_other_repo.txt: empty
# a: "a"
# b: "b"
# c/c: "c"
# c/d: "d"

# ===== yarsync-0.3.1/tests/test_dir_config_dir (test fixture tree) =====
# commits/1/a: empty
# commits/2/a: "a"
# config.ini: empty
# logs/2.txt: "When: Thu, 01 Jan 1970 03:00:02 MSK\nWhere: user@host"
# logs/3.txt: "log 3"
# repo_test_config.txt: empty
# rsync-filter: "- omit.this\n- b"

# ===== yarsync-0.3.1/tests/test_dir_empty (test fixture tree) =====
# .ys/config.ini: empty
# .ys/repo_myhost.txt: (1-byte file)
# .ys/rsync-filter: empty

# ===== yarsync-0.3.1/tests/test_dir_filter (test fixture tree) =====
# .ys/commits/1/a: empty
# .ys/commits/2/a: "a"
# .ys/config.ini: empty
# .ys/logs/2.txt: "When: Thu, 01 Jan 1970 03:00:02 MSK\nWhere: user@host"
# .ys/logs/3.txt: "log 3"
# .ys/repo_test_filter.txt: empty
# .ys/rsync-filter: "- omit.this\n- b"
# a: "a"
# b: "b"

# ===== yarsync-0.3.1/tests/test_dir_read_only (test fixture tree) =====
# empty.txt: empty

# ===== yarsync-0.3.1/tests/test_dir_work_dir (test fixture tree) =====
# a: "a"
# b: "b"

# ===== yarsync-0.3.1/tests/test_dir_ys_bad_permissions (test fixture tree) =====
# .ys/commits/1/a: empty
# .ys/commits/2/a: "a"
# .ys/commits/not-a-commit/git-dummy: empty
# .ys/config.ini: empty
# .ys/logs/2.txt: "When: Thu, 01 Jan 1970 03:00:02 MSK\nWhere: user@host"
# .ys/logs/3.txt: "log 3"
# .ys/repo_TEST.txt: empty
# a: "a"
# b: "b"
# forbidden/c: empty


# ===== yarsync-0.3.1/tests/test_env_vars.py =====
import os

from yarsync.yarsync import _substitute_env


def test_env_vars():
    os.environ["VAR"] = "var"
    # variable substitution works for several lines
    lines = "[]\npath=$VAR/maybe"
    assert _substitute_env(lines).getvalue() == "[]\npath=var/maybe"
    # unset variable is not substituted
    lines = "[]\npath=$VAR1/maybe"
    assert _substitute_env(lines).getvalue() == "[]\npath=$VAR1/maybe"


# ===== yarsync-0.3.1/tests/test_init.py =====
# Mock OS functions and check that they are called properly
import os
from sys import version_info

from yarsync import YARsync
from yarsync.yarsync import CONFIG_EXAMPLE
from .settings import YSDIR


def test_init_mixed(mocker):
    ## Mock existing directory and non-existent files ##
    args = "yarsync init".split()
    ys0 = YARsync(args)
    conffile = ys0.CONFIGFILE
    repofile = ys0.REPOFILE.format("my_repo")

    def _os_path_exists(filepath):
        if filepath == YSDIR:
            return True
        elif filepath == conffile:
            return True
        elif filepath.startswith(YSDIR):
            print(filepath)
            return False
        else:
            # won't get access to real os.path.exists(filepath)
            return False

    m = mocker.mock_open()
    mocker.patch("builtins.open", m)
    # the user inputs "my_repo"
    mocker.patch("builtins.input", lambda _: "my_repo")
    mocker.patch("os.path.exists", _os_path_exists)
    # otherwise listdir will complain that .ys doesn't exist
    mocker.patch("os.listdir", lambda _: [])

    # call _init
    res = ys0()
    assert res == 0

    call = mocker.call
    if version_info.minor >= 13:
        assert m.mock_calls == [
            # call(conffile, "w"),
            call().__enter__(),
            # call().write(CONFIG_EXAMPLE),
            call().write(''),
            # call().__exit__(None, None, None),
            call(repofile, "x"),
            call().__enter__(),
            call().__exit__(None, None, None),
            call().close()
        ]
    else:
        assert m.mock_calls == [
            # call(conffile, "w"),
            call().__enter__(),
            # call().write(CONFIG_EXAMPLE),
            call().write(''),
            # call().__exit__(None, None, None),
            call(repofile, "x"),
            call().__enter__(),
            call().__exit__(None, None, None)
        ]

    # old_calls = m.mock_calls[:]
    # clear the calls
    m.reset_mock()
    # the user inputs nothing, and hostname is taken
    mocker.patch("builtins.input", lambda _: "")
    mocker.patch("socket.gethostname", lambda: "my_host")
    # don't forget to call the function
    ys1 = YARsync(["yarsync", "init"])
    ys1._init()
    repofile = ys1.REPOFILE.format("my_host")
    assert call(repofile, "x") in m.mock_calls


def test_init_non_existent(mocker):
    def _os_path_exists(filepath):
        return False

    m = mocker.mock_open()
    mocker.patch("builtins.open", m)
    mocker.patch("os.path.exists", _os_path_exists)
    mkdir = mocker.patch("os.mkdir")
    mocker.patch("os.listdir", lambda _: [])

    args = "yarsync init myhost".split()
    ys = YARsync(args)
    conffile = ys.CONFIGFILE
    repofile = ys.REPOFILE.format("myhost")

    res = ys()
    assert res == 0

    call = mocker.call
    assert mkdir.mock_calls == [call(YSDIR)]
    # assert mkdir.mock_calls == [call(YSDIR, ys.DIRMODE)]
    if version_info.minor >= 13:
        assert m.mock_calls == [
            # mkdir is recorded separately
            call(conffile, "w"),
            call().__enter__(),
            call().write(CONFIG_EXAMPLE),
            call().write(''),
            call().__exit__(None, None, None),
            call().close(),
            call(repofile, "x"),
            call().__enter__(),
            call().__exit__(None, None, None),
            call().close(),
        ]
    else:
        assert m.mock_calls == [
            # mkdir is recorded separately
            call(conffile, "w"),
            call().__enter__(),
            call().write(CONFIG_EXAMPLE),
            call().write(''),
            call().__exit__(None, None, None),
            call(repofile, "x"),
            call().__enter__(),
            call().__exit__(None, None, None)
        ]


def test_init_existent(mocker):
    def _os_path_exists(filepath):
        # assume only files within YSDIR exist,
        # otherwise you'll have problems with gettext
        if os.path.commonprefix([filepath, YSDIR]) == YSDIR:
            return True
        # otherwise os.path.exists(filepath)
        # would cause infinite recursion here
        return False

    args = "yarsync init myhost".split()
    m = mocker.mock_open()
    mocker.patch("builtins.open", m)
    # no input is prompted when we provide repo name in CL args
    input_ = mocker.patch("builtins.input")
    mocker.patch("os.path.exists", _os_path_exists)
    mocker.patch("os.listdir", lambda _: ["repo_myhost.txt"])
    mkdir = mocker.patch("os.mkdir")

    ys = YARsync(args)
    res = ys()
    assert res == 0
    assert input_.mock_calls == []
    assert mkdir.mock_calls == []
    assert m.mock_calls == []


# ===== yarsync-0.3.1/tests/test_log.py =====
import os
import pytest
import time

from yarsync import YARsync
from .settings import TEST_DIR, TEST_DIR_EMPTY


def test_log_error(test_dir_read_only):
    """Test a not-yarsync directory."""
    os.chdir(test_dir_read_only)
    args = ["yarsync", "log"]
    with pytest.raises(OSError) as err:
        ys = YARsync(args)
    # the exact representation of value and the printed error message
    # are tested in test_status


def test_log_empty(mocker):
    os.chdir(TEST_DIR_EMPTY)
    mocker_print = mocker.patch("sys.stdout")
    args = ["yarsync", "log"]
    ys = YARsync(args)
    # call _log
    res = ys()
    call = mocker.call
    assert res == 0
    assert mocker_print.mock_calls == [
        call.write('No synchronization directory found.'),
        call.write('\n'),
        call.write('No synchronization information found.'),
        call.write('\n'),
        call.write('No commits found'),
        call.write('\n')
    ]


def test_log(mocker):
    os.chdir(TEST_DIR)
    mocker_print = mocker.patch("sys.stdout")
    args = ["yarsync", "log"]
    ys = YARsync(args)
    # call _log
    res = ys()
    call = mocker.call
    assert res == 0
    # # the date on Python 3.6 is still different (see test_commit.py)
    # if sys.version_info.minor <= 6:
    #     # will be UTC
    #     time1_str = time.strftime(ys.DATEFMT, time.localtime(1))
    # else:
    #     time1_str = "Thu, 01 Jan 1970 03:00:01 MSK"
    time1_str = time.strftime(ys.DATEFMT, time.localtime(1))
    # this time is fixed in log
    time2_str = "Thu, 01 Jan 1970 03:00:02 MSK"
    assert mocker_print.mock_calls == [
        # todo: missing synchronization should be tested somewhere.
        # call.write('No synchronization directory found.'),
        # call.write('\n'),
        # call.write('No synchronization information found.'),
        # call.write('\n'),
        call.write('commit 3 is missing'),
        call.write('\n'),
        call.write('log 3\n'),
        call.write(''),
        call.write('\n'),
        call.write('commit 2 <-> other_repo'),
        call.write('\n'),
        call.write('When: {}\nWhere: user@host\n'.format(time2_str)),
        call.write(''),
        call.write('\n'),
        call.write('commit 1'),
        call.write('\n'),
        call.write('Log is missing\nWhen: {}\n'.format(time1_str)),
        call.write(''),
    ]

    mocker_print.reset_mock()
    # yarsync log -n 1 -r
    args = ["yarsync", "log", "--max-count", "1", "--reverse"]
    ys = YARsync(args)
    res = ys()
    call = mocker.call
    assert res == 0
    assert mocker_print.mock_calls == [
        # call.write('No synchronization directory found.'),
        # call.write('\n'),
        # call.write('No synchronization information found.'),
        # call.write('\n'),
        call.write('commit 1'),
        call.write('\n'),
        call.write('Log is missing\nWhen: {}\n'.format(time1_str)),
        call.write(''),
    ]


def test_make_commit_log_list():
    commits = [1, 3]
    logs = [2]
    ys = YARsync(["yarsync", "log"])
    # the function is not called
    assert ys._make_commit_list(commits, logs) == [(1, None), (None, 2),
                                                   (3, None)]


# ===== yarsync-0.3.1/tests/test_pull_push.py =====
import os

import pytest

from yarsync.yarsync import YARsync, COMMAND_ERROR
from .helpers import clone_repo
from .settings import (
    TEST_DIR, TEST_DIR_YS_BAD_PERMISSIONS,
)


# run tests for each combination of arguments
# https://docs.pytest.org/en/latest/how-to/parametrize.html#pytest-mark-parametrize-parametrizing-test-functions
@pytest.mark.parametrize("pull", [True, False])
@pytest.mark.parametrize("dry_run", [True, False])
def test_pull_push_uncommitted(
        capfd, origin_test_dir, pull, dry_run,
):
    """Pull and push always fail when there are uncommitted changes."""
    os.chdir(TEST_DIR_YS_BAD_PERMISSIONS)
    command = ["yarsync"]
    if pull:
        command.append("pull")
    else:
        command.append("push")
    if dry_run:
        command.append("--dry-run")
    ys = YARsync(command + ["origin"])
    # remote "origin" is added in origin_test_dir.
    returncode = ys()
    assert returncode == COMMAND_ERROR
    captured = capfd.readouterr()
    assert "local repository has uncommitted changes" in captured.err
    assert "Changed since head commit:\n" in captured.out


@pytest.mark.parametrize("backup_dir", [True, False])
def test_backup(tmp_path_factory, backup_dir):
    local_path = tmp_path_factory.mktemp("local")
    source_path = tmp_path_factory.mktemp("repo")
    local = local_path.__str__()
    source = source_path.__str__()

    ## clone TEST_DIR -> source -> local
    clone_repo(TEST_DIR, source)
    # strange, why we first enter source, then local...
    # Better call them different names then.
    os.chdir(source)
    # we make a real yarsync clone just to have origin.
    YARsync(["yarsync", "-qq", "clone", "origin", local])()
    print("created yarsync repositories {} and {}".format(source, local))
    # adjust the real path to the repo
    source_name = os.path.basename(source)
    local_path = local_path / source_name

    # corrupt some local files
    local_a = local_path / "a"
    local_a.write_text("b\n")
    local_d = local_path / "c" / "d"
    local_d.write_text("c\n")

    os.chdir(local)
    YARsync(["yarsync", "init"])()
    YARsync(["yarsync", "remote", "add", "origin", source])()
    ys_push = YARsync(["yarsync", "push", "origin"])
    # if you have problems during push because of uncommitted changes,
    # this might be because of hard links broken by git.
    ys_push()
    source_a = source_path / "a"
    # no evil was transferred!
    # it won't be transferred after rsync is improved,
    # https://github.com/WayneD/rsync/issues/357
    # assert source_a.read_text() == "a\n"

    ys_command = ["yarsync", "pull"]
    if backup_dir:
        ys_command.extend(["--backup-dir", "BACKUP"])
    else:
        ys_command.append("--backup")
    ys_command.append("origin")
    ys_pull_backup = YARsync(ys_command)
    ys_pull_backup()
    files = os.listdir()
    # the correctness was transferred back again!
    # destination files are renamed
    # *** fix after https://github.com/WayneD/rsync/issues/357
    # assert local_a.read_text() == "a\n"
    # *** fix
    # if backup_dir:
    #     # there are two nested BACKUP-s: probably an rsync bug...
    #     bd = pathlib.Path(".") / "BACKUP" / "BACKUP"
    #     assert set(files) == set(("a", "b", ".ys", "c", "BACKUP"))
    #     # old corrupt a is saved here
    #     assert (bd / "a").read_text() == "b\n"
    #     # the real hierarchy is backed up
    #     assert (bd / "c" / "d").read_text() == "c\n"
    # else:
    #     assert set(files) == set(("a", "a~", "b", ".ys", "c"))
    #     # and the wrongdoings were preserved as well
    #     assert (local_path / "a~").read_text() == "b\n"
    #     assert (local_path / "c" / "d~").read_text() == "c\n"

    # we can't pull or push in an updated state
    # *** fix
    # assert ys_pull_backup._status(check_changed=True)[1] is True


# ===== yarsync-0.3.1/tests/test_remote.py =====
import pytest

from yarsync.yarsync import YARsync
from yarsync.yarsync import (
    COMMAND_ERROR
)


@pytest.mark.usefixtures("test_dir")
def test_remote_add(capfd):
    remote_command = "yarsync remote add".split()
    # adding existing remote fails
    ys = YARsync(remote_command + ["other_repo", "/some/path/"])
    returncode = ys()
    assert returncode == COMMAND_ERROR
    captured = capfd.readouterr()
    assert "remote other_repo exists, break."
in captured.err assert not captured.out yarsync-0.3.1/tests/test_status.py000066400000000000000000000120121477154272500173130ustar00rootroot00000000000000import os import pytest from yarsync import YARsync from yarsync.yarsync import _Sync from .settings import ( TEST_DIR, TEST_DIR_EMPTY, TEST_DIR_CONFIG_DIR, TEST_DIR_WORK_DIR, TEST_DIR_FILTER, TEST_DIR_YS_BAD_PERMISSIONS, ) def test_status_error(mocker, test_dir_read_only): ## test directory without .ys configuration os.chdir(test_dir_read_only) mocker_stdout = mocker.patch("sys.stdout") mocker_stderr = mocker.patch("sys.stderr") call = mocker.call args = ["yarsync", "status"] # issues a mocker warning # with mocker.patch("sys.stderr") as mocker_stderr: with pytest.raises(OSError) as err: ys = YARsync(args) assert ".ys not found" in repr(err.value) # adapted from https://stackoverflow.com/a/59398826/952234 write_calls = mocker_stderr.write.call_args_list # [0] is call args, [1] is kwargs written_strs = "".join(call[0][0] for call in write_calls) # error message is correct assert written_strs.startswith( "! fatal: no yarsync configuration directory .ys found\n" ) # no stdout output assert mocker_stdout.mock_calls == [] # don't test for exact messages, # because we might improve them in the future. 
    # assert mocker_print.mock_calls == [
    #     call.write('!'),
    #     call.write(' '),
    #     call.write("fatal: no yarsync configuration "
    #                ".ys found"),
    #     call.write('\n')
    # ]


def test_status_error_bad_permissions(capfd):
    os.chdir(TEST_DIR_YS_BAD_PERMISSIONS)
    ys = YARsync(["yarsync", "status"])
    returncode = ys()
    # rsync returns 23 in case of permission errors
    assert returncode == 23
    # mock will not work with non-Python stderr,
    # https://github.com/pytest-dev/pytest-mock/issues/295#issuecomment-1155105804
    # so we use capfd
    # https://docs.pytest.org/en/stable/how-to/capture-stdout-stderr.html#accessing-captured-output-from-a-test-function
    # https://docs.pytest.org/en/stable/reference/reference.html#capfd
    captured = capfd.readouterr()
    assert 'test_dir_ys_bad_permissions/forbidden" failed: Permission denied '\
        in captured.err
    assert "No synchronization information found." in captured.out


def test_status_no_commits(mocker):
    os.chdir(TEST_DIR_EMPTY)
    # io.StringIO uses only utf-8
    mocker_print = mocker.patch("sys.stdout")  # , new_callable=StringIO)
    args = ["yarsync", "status"]
    ys = YARsync(args)
    res = ys()
    call = mocker.call

    assert res == 0
    assert mocker_print.mock_calls == [
        call.write('In repository myhost'),
        call.write('\n'),
        call.write('No commits found'),
        call.write('\n')
    ]


@pytest.mark.parametrize(
    "command,test_dir",
    [
        (["yarsync"], TEST_DIR_FILTER),
        (["yarsync", "--config-dir=" + TEST_DIR_CONFIG_DIR], TEST_DIR_WORK_DIR),
    ]
)
def test_status_existing_commits(capfd, command, test_dir):
    """Status works same for a normal and a detached configuration."""
    os.chdir(test_dir)
    # mocker_print = mocker.patch("sys.stdout")
    command.append("status")
    ys = YARsync(command)
    res = ys()
    # we don't check for the exact rsync message here, only for results.
    # filter is needed, because not only .ys can be excluded
    assert res == 0

    ## stdout is correct
    # we don't test for exact lines, because they can change
    # when we e.g. add and remove a file to a subdirectory
    # (thus changing its timestamps)
    captured = capfd.readouterr()
    assert not captured.err
    assert captured.out.endswith(
        'Nothing to commit, working directory clean.\n'
        'No synchronization directory found.\n'
        'No synchronization information found.\n'
    )
    # assert mocker_print.mock_calls == [
    #     # this is written only with -v
    #     # call.write('# '),
    #     # call.write(''),
    #     # call.write(
    #     #     "rsync -aun --delete -i --exclude=/.ys {} --outbuf=L {}/ {}/commits/2"\
    #     #     .format(filter_str, ys.root_dir, ys.config_dir)
    #     # ),
    #     # call.write('\n'),
    #     call.write('Nothing to commit, working directory clean.'),
    #     call.write('\n'),
    #     call.write('No synchronization information found.'),
    #     call.write('\n'),
    # ]


@pytest.mark.parametrize(
    "local_sync",
    [
        (["1_other_repo.txt"]),
        (["2_other_repo.txt", "1_other_repo.txt"]),
    ]
)
def test_status_existing_sync(mocker, capfd, local_sync):
    n_commits_ahead = 1
    other_repo = "other_repo"
    os.chdir(TEST_DIR)
    ys = YARsync(["yarsync", "status"])
    sync = _Sync(local_sync)
    mock = mocker.Mock()
    mock.return_value = sync
    ys._get_local_sync = mock
    res = ys()
    captured = capfd.readouterr()
    assert not captured.err
    if len(local_sync) == 1:
        sync_str = "Local repository is {} commits ahead of {}\n"\
            .format(n_commits_ahead, other_repo)
        assert sync_str in captured.out
    else:
        sync_str = "Commits are up to date with {}.\n".format(other_repo)
        assert sync_str in captured.out


# ===== yarsync-0.3.1/tests/test_sync.py =====

import pytest

from yarsync.yarsync import _Sync as Sync, YSConfigurationError


def test_sync():
    sync_list0 = [
        "1_a", "1_b2",
        "2_c", "2_a"
    ]
    s0 = Sync([s + ".txt" for s in sync_list0])
    assert s0.by_repos == {
        "a": 2,
        "b2": 1,
        "c": 2,
    }
    assert s0.by_commits() == {
        2: set(("c", "a")),
        1: set(("b2",)),
    }

    sync_list1 = ["1_a", "2_b2", "3_dd"]
    s1 = Sync([s + ".txt" for s in sync_list1])
    s0.update(s1.by_repos.items())
    repos1 = {
        "a": 2,
        "b2": 2,
        "c": 2,
        "dd": 3,
    }
    assert s0.by_repos == repos1
    assert s0.by_commits() == {
        2: set(("c", "a", "b2")),
        3: set(("dd",)),
    }
    # assert s0.repos == frozenset(("a", "b", "c"))

    # tuple, otherwise set contains characters from the string
    removed1 = set(("1_b2.txt",))
    # note that 1_a is not removed, because it was not checked.
    # It will be eventually removed after pushes/pulls.
    assert s0.removed == removed1
    new1 = set(("2_b2.txt", "3_dd.txt"))
    assert s0.new == new1

    # update is idempotent (doesn't change new and removed, as well as sync)
    s0.update(s1.by_repos.items())
    assert s0.by_repos == repos1
    assert s0.removed == removed1
    assert s0.new == new1

    # incorrect commit number raises
    with pytest.raises(YSConfigurationError):
        Sync(["a_a.txt"])


def test_sync_bool():
    # False for an empty synchronization object
    assert not Sync([])
    # True if there is data
    sync_list = ["1_a.txt"]
    assert Sync(sync_list)


# ===== yarsync-0.3.1/tests/test_ys_helpers.py =====

"""Test various small yarsync commands and configuration."""
import os

import pytest

from yarsync import YARsync
from yarsync.yarsync import _is_commit, _substitute_env
from yarsync.yarsync import (
    CONFIG_EXAMPLE, YSConfigurationError
)

from .settings import TEST_DIR


def test_config(tmp_path):
    # Test real configuration file
    # example configuration is written during the initialisation
    os.chdir(tmp_path)
    ys = YARsync(["yarsync", "init"])
    ys()
    config_filename = ys.CONFIGFILE
    with open(config_filename) as config:
        assert config.read() == CONFIG_EXAMPLE

    # wrong configuration raises
    wrong_config = """\
[remote1]
path = /

# duplicate sections are forbidden
[remote1]
path = /
"""
    with open(config_filename, "w") as config:
        config.write(wrong_config)
    # config is used only by pull, push and remote add.
    # So it should be rather remotes.ini
    # But we don't have any other config then.
    with pytest.raises(YSConfigurationError) as err:
        YARsync(["yarsync", "pull", "-n", "remote1"])
    # err is not a YSConfigurationError, but a pytest.ExceptionInfo,
    # that is why we take err.value
    assert "DuplicateSectionError" in err.value.msg
    assert "[line 4]: section 'remote1' already exists" in err.value.msg


def test_read_config():
    os.chdir(TEST_DIR)
    # we need to initialise one object to test its method
    ys = YARsync(["yarsync", "status"])  # actual configuration is ignored here

    config_with_default = """\
[DEFAULT]
host_from_section_name

[myremote]
path = /

[mylocal]
# localhost
host =
path = /
"""
    config, config_dict = ys._read_config(config_with_default)
    # path is unchanged. Changes are in the destpath.
    assert config_dict["myremote"]["path"] == "/"
    assert config_dict["myremote"]["destpath"] == "myremote:/"
    # section host overwrites the default host.
    assert config_dict["mylocal"]["destpath"] == "/"

    config_without_default = """\
[myremote]
path = /

[mylocal]
# localhost
host =
path = /

[myremote1]
host = myremote1
path = /

[myremote2]
path = myremote2:/
"""
    config, config_dict = ys._read_config(config_without_default)
    # oops, remote is considered a local path now...
    assert config_dict["myremote"]["destpath"] == "/"
    assert config_dict["mylocal"]["destpath"] == "/"
    # but we can provide a correct host:
    assert config_dict["myremote1"]["destpath"] == "myremote1:/"
    # or a correct path:
    assert config_dict["myremote2"]["destpath"] == "myremote2:/"


def test_env_vars():
    os.environ["VAR"] = "var"
    # variable substitution works for several lines
    lines = "[]\npath=$VAR/maybe"
    assert _substitute_env(lines).getvalue() == "[]\npath=var/maybe"
    # unset variable is not substituted
    lines = "[]\npath=$VAR1/maybe"
    assert _substitute_env(lines).getvalue() == "[]\npath=$VAR1/maybe"


def test_error(test_dir_read_only):
    os.chdir(test_dir_read_only)
    ys = YARsync(["yarsync", "init"])
    # error messages are reasonable, so we don't test them here.
    returncode = ys()
    assert returncode == 8


def test_is_commit():
    assert _is_commit("1") is True
    assert _is_commit("01") is True
    assert _is_commit("abc") is False


def test_print(mocker):
    # ys must be initialised with some settings.
    os.chdir(TEST_DIR)
    mocker_print = mocker.patch("sys.stdout")
    call = mocker.call

    # default verbosity
    args = ["yarsync", "log"]
    ys = YARsync(args)  # command is not called
    assert ys.print_level == ys._default_print_level
    ys._print("debug", level=2)
    assert mocker_print.mock_calls == [
        call.write('debug'),
        call.write('\n')
    ]
    mocker_print.reset_mock()

    # verbose
    args = ["yarsync", "-v", "log"]
    ys = YARsync(args)
    ys._print("debug", level=3)
    assert mocker_print.mock_calls == [
        call.write('# '),
        call.write(''),
        call.write('debug'),
        call.write('\n')
    ]
    mocker_print.reset_mock()

    # decrease verbosity
    ys.print_level = 1
    ys._print("general")
    assert mocker_print.mock_calls == [
        call.write('general'),
        call.write('\n')
    ]
    mocker_print.reset_mock()
    # print level higher than that of the YARsync object
    ys._print("debug unavailable", level=2)
    assert mocker_print.mock_calls == []


def test_print_command(capfd):
    os.chdir(TEST_DIR)
    args = ["yarsync", "log"]
    ys = YARsync(args)  # command is not called

    # any string command is printed as it is
    command_str = "just a string, will be unchanged"
    ys._print_command(command_str)
    captured = capfd.readouterr()
    assert not captured.err
    assert command_str + '\n' == captured.out

    command = ["this", "should be quoted"]
    ys._print_command(command)
    captured = capfd.readouterr()
    assert not captured.err
    assert captured.out == "this 'should be quoted'\n"


def test_version(capfd):
    from yarsync.yarsync import __version__
    try:
        YARsync(["yarsync", "--version"])
    except SystemExit as err:
        assert err.code == 0
    captured = capfd.readouterr()
    assert not captured.err
    assert captured.out == "yarsync version " + __version__ + "\n"


# ===== yarsync-0.3.1/tox.ini =====

# tox (https://tox.readthedocs.io/) is a tool for running tests
# in multiple virtualenvs. This configuration file will run the
# test suite on all supported python versions. To use it, "pip install tox"
# and then run "tox" from this directory.
# To run for a specific version:
# tox -e py37

[tox]
envlist = py37, py38, py39, py310, py311, py312, py313, pypy3
# We need this in order to test end-of-life python versions, e.g. py36, py37
# requires = virtualenv < 20.22.0
# Tested on Python 3.6 on another host, because of this bug with libffi:
# https://github.com/microsoft/azuredatastudio/issues/10429

[testenv]
deps =
    pytest
    pytest-mock
commands = pytest tests


# ===== yarsync-0.3.1/yarsync/__init__.py =====

"""a file synchronization and backup tool.

To list available commands, run

$ yarsync --help

Read YARsync manual for complete documentation.
https://github.com/ynikitenko/yarsync
"""
# otherwise one would have to write 'from yarsync.yarsync import YARsync'
from .yarsync import YARsync


# ===== yarsync-0.3.1/yarsync/version.py =====

# don't forget to update the version in docs/source/yarsync.1.md manually,
# as well as copyright years in Sphinx conf
# and then to re-generate the man file with 'make man'
__version__ = '0.3'


# ===== yarsync-0.3.1/yarsync/yarsync.py =====

# Yet Another Rsync is a file synchronization tool

import argparse
import configparser
import functools
# for user name
import getpass
import io
import os
import re
# rmtree
import shutil
# for host name
import socket
import subprocess
import sys
import time

from .version import __version__


########################
### MODULE CONSTANTS ###
########################

## Return codes ##
# argparse error (raised during YARsync.__init__)
SYNTAX_ERROR = 1
# ys configuration error: not in a repository,
# a config file missing, etc.
CONFIG_ERROR = 7
# ys command error: the repository is correct,
# but in a state forbidding the action
# (for example, one can't push when there are uncommitted changes
# or add a remote with an already present name)
COMMAND_ERROR = 8
# Python interpreter could get KeyboardInterrupt
# or other exceptions leading to sys.exit()
SYS_EXIT_ERROR = 9


## Custom exception classes ##

class YSError(Exception):
    """Base for all yarsync exceptions."""
    pass


class YSConfigurationError(YSError):

    def __init__(self, arg="", msg=""):
        self.arg = arg
        self.msg = msg


class YSCommandError(YSError):
    """Can be raised if a yarsync command was not successful."""

    def __init__(self, code=0):
        # code might be unimportant, therefore we allow 0
        self.code = code


## command line arguments errors

class YSArgumentError(YSError):
    # don't know how to initialize an argparse.ArgumentError,
    # so create a custom exception

    def __init__(self, arg="", msg=""):
        self.arg = arg
        self.msg = msg


class YSUnrecognizedArgumentsError(YSError, SystemExit):

    def __init__(self, code):
        # actually the code is never used later
        SystemExit.__init__(self, code)
        # super(YSUnrecognizedArgumentsError, self).__init__(code)


## Example configuration ##
CONFIG_EXAMPLE = """\
# uncomment and edit sections or use
# $ yarsync remote add
# to add a remote.
#
# [remote]
# # comments are allowed
# path = remote:/path/on/remote/to/my_repo
# # spaces around '=' are allowed
# # spaces in section names are not allowed
# # (will be treated as part of the name)
#
# # several sections are allowed
# [my_drive]
# path = $MY_DRIVE/my_repo
# # inline comments are not allowed
# # don't write this!
# # host =  # empty host means localhost
# # this is correct:
# # empty host means localhost
# host =
#
# Variables in paths are allowed.
# For them to take the effect, run
# $ export MY_DRIVE=/run/media/my_drive
# $ yarsync push -n my_drive
# Always try --dry-run first to ensure
# that all paths still exist and are correct!
"""


######################
## Helper functions ##
######################

def _check_positive(value):
    """Convert a string *value* to a natural number or raise."""
    # based on https://stackoverflow.com/a/14117511/952234
    err = argparse.ArgumentTypeError("must be a natural number")
    try:
        natural_num = int(value)
    except ValueError:
        raise err
    if natural_num <= 0:
        raise err
    return natural_num


def _get_repo_name_if_exists(file_list=None, config_dir=""):
    # separate function, because used by several classes
    """*file_list* is a list of configuration files in YSDIR.
    For a local repository it is automatically obtained here.
    """
    if file_list is None:
        file_list = os.listdir(config_dir)
    reponame = None
    # a list for generality, if we are doing several clones
    cloning_to = []
    for fil in file_list:
        # format of CLONETOFILE
        if fil.startswith("CLONE_TO_") and fil.endswith(".txt"):
            print("cloning to {}".format(fil[9:-4]))
            cloning_to.append(fil[9:-4])
    for fil in file_list:
        # format of REPOFILE
        if fil.startswith("repo_") and fil.endswith(".txt"):
            # if we are cloning there, it is not this repo name.
            _reponame = fil[5:-4]
            if _reponame in cloning_to:
                continue
            if reponame is not None:
                err_msg = (
                    "several repository names found, {} and {} . "\
                    .format(reponame, fil)
                    + "They could be left after clone"
                )
                _print_error(err_msg)
                raise YSConfigurationError(err_msg)
            reponame = _reponame
    # todo: assert reponame
    return reponame


def _get_root_directory(config_dir_name):
    """Search for a directory containing *config_dir_name*
    higher in the file system hierarchy.
    """
    cur_path = os.getcwd()
    # path without symlinks
    root_path = os.path.realpath(cur_path)
    # git stops at the file system boundary,
    # but we ignore that for now.
    while True:
        test_path = os.path.join(root_path, config_dir_name)
        if os.path.exists(test_path):
            # without trailing slash
            return root_path
        if os.path.dirname(root_path) == root_path:
            # won't work on Windows shares with '\\server\share',
            # but ignore for now.
            # https://stackoverflow.com/a/10803459/952234
            break
        root_path = os.path.dirname(root_path)
    raise OSError(
        "configuration directory {} not found".format(config_dir_name)
    )


def _is_commit(file_name):
    """A *file_name* is a commit if it can be converted to int."""
    try:
        int(file_name)
    except (TypeError, ValueError):
        return False
    return True


def _is_remote(path):
    """A path is remote (for rsync) if ':' goes before '/'."""
    # todo: change to _get_remote_host, which returns "" for a local one
    host_sep = path.find(':')
    if host_sep == -1:
        # local
        return False
    # or '\' for Windows
    first_dir_sep = path.find('/')
    if first_dir_sep != -1:
        # if host_sep > first_dir_sep,
        # then our local path contains a colon
        return host_sep < first_dir_sep
    else:
        # no path separator
        return True


def _print_error(msg):
    # todo: allow arbitrary number of arguments.
    # not a class method, because it can be run
    # when YARsync was not fully initialized yet.
    print("!", msg, file=sys.stderr)


# copied with some modifications from
# https://github.com/DiffSK/configobj/issues/144#issuecomment-347019778
# Another proposed option is
# config = ConfigParser(os.environ),
# which is awful and unsafe,
# because it adds all the environment to the configuration
def _substitute_env(content):
    """Read filename, substitute environment variables and return
    a file-like object of the result.

    Substitution maps text like "$FOO" to the environment variable "FOO".
    """

    def lookup(match):
        """Replace a match like $FOO with the env var FOO."""
        key = match.group(2)
        if key not in os.environ:
            # variables should be set for values that are used,
            # but not necessarily for all values.
            # raise OSError("variable {} unset".format(key))
            # unset variables return unchanged (with $)
            return match.group(1)
            # raise Exception("Config env var '{key}' not set".format(key))
        return os.environ.get(key)

    # todo: allow more sophisticated variables, like ${VAR}
    # (and that's all), should be an OR of this and
    # r'(\${(\w+)})'), untested.
    # Not sure it's needed: why such complications to a config file?..
    pattern = re.compile(r'(\$(\w+))')
    replaced = pattern.sub(lookup, content)
    try:
        result = io.StringIO(replaced)
    except TypeError:
        # Python2
        result = io.StringIO(unicode(replaced, "utf-8"))
    return result


def _mkhostpath(host, path):
    if host:
        return host + ":" + path
    # localhost
    return path


####################
## Helper classes ##
####################

class _Config():
    """Store configuration for different replicas."""

    def __init__(self, file_list, allow_empty=False):
        commits = []
        try:
            cmts = file_list["commits"]
        except KeyError:
            pass
        else:
            for comm in cmts:
                try:
                    commit = int(comm)
                except ValueError:
                    continue
                    # this is not a crucial error.
                    # Maybe we'd like to store there a symlink "head"?
                    # Neither is this checked in local commits.
                    # raise OSError(
                    #     "not a commit found on remote: {}".format(comm)
                    # )
                commits.append(commit)

        if "sync" in file_list:
            sync = _Sync(file_list["sync"])
        else:
            sync = _Sync([])

        self.repo_name = _get_repo_name_if_exists(file_list=file_list)
        if not self.repo_name and not allow_empty:
            raise YSConfigurationError(
                msg="Could not find repository name. "
                    "Provide one with init."
            )

        self.commits = commits
        self.sync = sync
        self._file_list = file_list


class _Sync():
    """Manage synchronizations for different repositories.

    Public fields: by_repos.
    """

    def __init__(self, sync_list):
        """*sync_list* is a list of synchronization files
        in the format <commit>_<repo>.txt .
        """
        br = {}
        for s in sync_list:
            commit, repo = s.split("_", maxsplit=1)
            repo = repo[:-4]  # remove .txt extension
            try:
                commit = _check_positive(commit)
            except argparse.ArgumentTypeError:
                raise YSConfigurationError(
                    msg="commit must be a natural number. {} found"\
                        .format(commit)
                )
            # for each repository, store the most recent
            # synchronized commit.
            if repo in br:
                br[repo] = max(commit, br[repo])
            else:
                br[repo] = commit
        # to quickly get all synchronized commits, use br.values()
        self.SYNCSTR = "{}_{}.txt"  # .format(commit, repo)
        self.by_repos = br
        # outdated commits from other to be removed.
        # sets, because dictionary iteration is arbitrary
        self.removed = set()
        self.new = set()
        # self.repos = frozenset(br)  # keys

    def __bool__(self):
        # dictionary is True <=> non-empty
        return bool(self.by_repos)

    # note that by_commits() is a function,
    # while by_repos is a field. This may be confusing,
    # but it reflects their computational difference.
    def by_commits(self):
        bc = {}
        for repo, commit in self.by_repos.items():
            if commit in bc:
                bc[commit].add(repo)
            else:
                bc[commit] = set((repo,))
        return bc

    def get_synced_repos_for(self, commit, exclude_repo=""):
        return [repo for repo in self.by_commits()[commit]
                if repo != exclude_repo]

    def remove_repo(self, repo):
        # a repo is removed only from a clean state
        assert not self.new and not self.removed
        commit = self.by_repos[repo]
        sync_str = self.SYNCSTR.format(commit, repo)
        self.removed.add(sync_str)

    def update(self, other):
        """Update synchronization information with that from *other*.

        *other* is an iterable of (repo, commit) pairs, for example,
        *_Sync.by_repos.items()*.
        """
        local = self.by_repos
        new = self.new
        removed = self.removed
        _syncstr = self.SYNCSTR
        for repo, commit in other:
            sync_str = _syncstr.format(commit, repo)
            if repo in local:
                if commit > local[repo]:
                    # remove outdated local synchronization
                    local_sync_str = _syncstr.format(local[repo], repo)
                    removed.add(local_sync_str)
                    local[repo] = commit
                    new.add(sync_str)
                # we don't delete synchronization
                # that is not present locally
                # elif commit < local[repo]:
                #     removed.add(sync_str)
            else:
                local[repo] = commit
                new.add(sync_str)


class YARsync():
    """Synchronize data.
Provide configuration and wrap rsync calls.""" def __init__(self, argv): """*argv* is the list of command line arguments.""" parser = argparse.ArgumentParser( description="yarsync is a file synchronization and backup tool", # exit_on_error appeared only in Python 3.9 # and doesn't seem to work. Skip and be more cross-platform. # exit_on_error=False ) # failed to implement that with ArgumentError # parser = _ErrorCatchingArgumentParser(...) subparsers = parser.add_subparsers( title="Available commands", dest="command_name", # description="valid commands", help="type 'yarsync --help' for additional help", # or it will print a list of commands in curly braces. metavar="command", ) ################################### ## Initialize optional arguments ## ################################### # .ys directory parser.add_argument("--config-dir", default="", help="path to the configuration directory") parser.add_argument("--root-dir", default="", help="path to the root of the working directory") # this option is applied not to all commands. # Moreover, we can't write "yarsync -n 2 log" => # don't create an illusion # that we can put an option at any place. # However, the upside of leaving it here might be # its better visibility during the general help # (not for a subcommand). 
# parser.add_argument( # "-n", "--dry-run", action="store_true", # default=False, # help="print what would be transferred during a real run, " # "but do not make any changes" # ) verbose_group = parser.add_mutually_exclusive_group() verbose_group.add_argument("-q", "--quiet", action="count", # otherwise default will be None default=0, help="decrease verbosity") verbose_group.add_argument("-v", "--verbose", action="count", default=0, help="increase verbosity") # this is not an option, but more like a separate command parser.add_argument("--version", "-V", action="store_true", help="print version") ############################ ## Initialize subcommands ## ############################ # or sub-commands # checkout # parser_checkout = subparsers.add_parser( "checkout", # help="check out a commit" help="restore the working directory to a commit" ) parser_checkout.add_argument( "-n", "--dry-run", action="store_true", default=False, help="print what will be transferred during a real checkout, " "but don't make any changes" ) # we write metavars as in git, # to distinguish them from rsync VAR. 
parser_checkout.add_argument( "commit", metavar="", help="commit name" ) parser_checkout.set_defaults(func=self._checkout) # clone # parser_clone = subparsers.add_parser( "clone", help="clone a local repository to remote or otherwise" ) parser_clone.add_argument( "name", metavar="", help="name of the clone", ) parser_clone.add_argument( "path", metavar="", help="path to the origin (from) " "or to the parent directory of the clone (to)" ) parser_clone.add_argument( "-f", "--force", action="store_true", help="ignore remote rsync-filter (only for pull)" ) # commit # parser_commit = subparsers.add_parser( "commit", help="commit the working directory" ) parser_commit.add_argument( "-m", "--message", metavar="", default="", help="a string with the commit message" ) parser_commit.add_argument( "--limit", metavar="", type=_check_positive, help="maximum number of commits" ) # diff # parser_diff = subparsers.add_parser( "diff", help="print the difference between two commits" ) parser_diff.add_argument( "commit", metavar="", help="commit name" ) parser_diff.add_argument( "other_commit", metavar="", nargs="?", default=None, help="other commit name" ) parser_diff.set_defaults(func=self._diff) # init # parser_init = subparsers.add_parser("init", help="initialize a repository") # add this option into the new release after improved testing. 
# parser_init.add_argument( # "--merge", action="store_true", help="merge existing repositories" # ) # reponame is used during commits parser_init.add_argument( "reponame", nargs="?", metavar="", help="name of the repository (for commits and logs)" ) # log # parser_log = subparsers.add_parser( "log", help="print commit logs" ) parser_log.add_argument( "-n", "--max-count", metavar="", type=int, default=-1, help="maximum number of logs shown" ) parser_log.add_argument("-r", "--reverse", action="store_true", help="reverse the order of the output") parser_log.set_defaults(func=self._log) # todo: log # pull # parser_pull = subparsers.add_parser( "pull", help="fetch data from source" ) # mutually exclusive arguments pull_group = parser_pull.add_mutually_exclusive_group() force_help = "remove commits and logs missing on source" pull_group.add_argument( "-f", "--force", action="store_true", help=force_help ) pull_group.add_argument( "--new", action="store_true", help="do not remove local data that is missing on source" ) pull_group.add_argument( "-b", "--backup", action="store_true", help="changed local files are renamed (not overwritten or ignored)" ) pull_group.add_argument( "--backup-dir", default="", metavar="DIR", help="changed local files are put into DIR preserving their paths" ) parser_pull.add_argument("source", metavar="", help="source name") # push # parser_push = subparsers.add_parser( "push", help="send data to a destination" ) parser_push.add_argument( "-f", "--force", action="store_true", help=force_help ) # we don't allow pushing new files to remote, # because that could cause its inconsistent state # (while locally we merge new files manually) parser_push.add_argument( "destination", metavar="", help="destination name" ) # common pull and push options for pparser in (parser_pull, parser_push): pparser.add_argument( "-n", "--dry-run", action="store_true", default=False, help="print what would be transferred during a real run, " "but do not make any change" 
) # pparser.add_argument( # # not sure whether -o would be a good shortening # # (-o might go for options) # "--overwrite", action="store_true", # default=False, # help="propagate file changes" # ) # remote # parser_remote = subparsers.add_parser( "remote", help="manage remote repositories" ) # this is a different option from "yarsync -v" here. parser_remote.add_argument( "-v", "--verbose", action="store_true", # action="count", help="print repository paths" ) subparsers_remote = parser_remote.add_subparsers( title="remote commands", dest="remote_command", help="type 'yarsync remote --help' for additional help", metavar="", ) # parse_intermixed_args is missing in Python 2, # that's why we allow -v flag only after 'remote'. # # remote_parent_parser = argparse.ArgumentParser(add_help=False) # remote_parent_parser.add_argument( # "-v", "--verbose", action="count", # help="show remote paths. Insert after a remote command" # ) ## remote add parser_remote_add = subparsers_remote.add_parser( "add", help="add a remote" ) parser_remote_add.add_argument( "repository", metavar="", help="repository name", ) parser_remote_add.add_argument( "path", metavar="", help="repository path", ) # not used yet. 
        # parser_remote_add.add_argument(
        #     "options", nargs='*',
        #     help="repository options",
        # )

        ## remote rm
        parser_remote_rm = subparsers_remote.add_parser(
            "rm", help="remove a remote"
        )
        parser_remote_rm.add_argument(
            "repository", metavar="<repository>",
            help="repository name",
        )
        for remote_subparser in [parser_remote_add, parser_remote_rm]:
            remote_subparser.set_defaults(func=self._remote)

        ## remote show, default
        parser_remote_show = subparsers_remote.add_parser(
            "show", help="print remotes"
        )
        parser_remote_show.set_defaults(func=self._remote_show)

        # show #
        parser_show = subparsers.add_parser(
            "show",
            help="print log messages and actual changes for commit(s)"
        )
        parser_show.add_argument(
            "commit", nargs="+", metavar="<commit>",
            help="commit name"
        )
        parser_show.set_defaults(func=self._show)

        # status #
        parser_status = subparsers.add_parser(
            "status", help="print updates since last commit"
        )
        parser_status.set_defaults(func=self._status)

        #####################
        ## Parse arguments ##
        #####################

        # basename, because ipython may print full path
        self.NAME = os.path.basename(argv[0])  # "yarsync"
        # directory with commits and other metadata
        # (may be updated by command line arguments)
        self.YSDIR = ".ys"

        if len(argv) > 1:  # 0th argument is always present
            try:
                args = parser.parse_args(argv[1:])
            except SystemExit as err:
                # argparse can raise SystemExit
                # in case of unrecognized arguments
                # (apart from ArgumentError and ArgumentTypeError;
                # hope this is the complete list)
                if err.code == 0:
                    raise err
                raise YSUnrecognizedArgumentsError(err.code)
            else:
                if args.version:
                    self._print_version()
                    sys.exit(0)
        else:
            # default is print help.
            # Will raise SystemExit(0).
            args = parser.parse_args(["--help"])

        ########################
        ## Init configuration ##
        ########################

        _ysdir = self.YSDIR
        root_dir = os.path.expanduser(args.root_dir)
        config_dir = os.path.expanduser(args.config_dir)

        if not root_dir and not config_dir:
            if args.command_name == "init":
                root_dir = "."
                config_dir = _ysdir
            else:
                # search the current directory and its parents
                try:
                    root_dir = _get_root_directory(_ysdir)
                except OSError as err:
                    # config dir not found.
                    if args.command_name == "clone":
                        # clone from remote here.
                        # We don't allow cloning into an existing
                        # repository, because one would need to
                        # add a filter then; it is more complicated
                        root_dir = ""
                    else:
                        _print_error(
                            "fatal: no {} configuration directory {} found".
                            format(self.NAME, _ysdir)
                            + "\n Check that you are inside"
                            " an existing repository"
                            "\n or initialize a new repository"
                            " with '{} init'.".
                            format(self.NAME)
                        )
                        raise err
                config_dir = os.path.join(root_dir, _ysdir)
        elif config_dir:
            if not root_dir:
                # If we are right in the root dir,
                # this argument should not be required.
                # But it is error prone if we move to a subdirectory
                # and call checkout (because root-dir will be wrong).
                # If the user wants safety,
                # they can provide the root-dir themselves
                # together with config-dir
                # (say, with an alias for 'yarsync --config-dir=...')
                root_dir = "."
        else:
            err_msg = "yarsync: error: --root-dir requires --config-dir "\
                      "to be provided"
            # we don't _print_error here,
            # because we want to mimic an argparse error.
            print(err_msg)
            # could not initialize ArgumentError here,
            # so created a new one
            raise YSArgumentError("root-dir", err_msg)

        self.root_dir = root_dir
        self.config_dir = config_dir
        # set technical attributes
        self._remote_config = None

        # directory creation mode could be set from:
        # - command line argument
        # - global configuration
        # - mode of the sync-ed directory (may be best)
        # - hardcoded
        # - just skipped (and will be set correctly by the OS).
        # self.DIRMODE = 0o755

        self.CLONETOFILE = os.path.join(self.config_dir, "CLONE_TO_{}.txt")
        self.COMMITDIRNAME = "commits"
        self.COMMITDIR = os.path.join(self.config_dir, self.COMMITDIRNAME)
        self.CONFIGFILE = os.path.join(self.config_dir, "config.ini")
        self.DATEFMT = "%a, %d %b %Y %H:%M:%S %Z"
        self.HEADFILE = os.path.join(self.config_dir, "HEAD.txt")
        self.COMMITLIMITNAME = "COMMIT_LIMIT.txt"
        self.COMMITLIMITFILE = os.path.join(self.config_dir,
                                            self.COMMITLIMITNAME)
        self.LOGDIRNAME = "logs"
        self.LOGDIR = os.path.join(self.config_dir, self.LOGDIRNAME)
        self.MERGEFILE = os.path.join(self.config_dir, "MERGE.txt")
        # template for the repository name
        self.REPOFILE = os.path.join(self.config_dir, "repo_{}.txt")
        self.RSYNCFILTERNAME = "rsync-filter"
        self.RSYNCFILTER = os.path.join(self.config_dir, self.RSYNCFILTERNAME)
        # yarsync repositories are owned by one user.
        # However, different machines can have different user
        # and group ids, so we don't push extraneous ids there.
        # Used in pull and push (and indirectly in clone).
        self.RSYNCOPTIONS = ["-avH", "--no-owner", "--no-group"]
        self.SYNCDIRNAME = "sync"
        self.SYNCDIR = os.path.join(self.config_dir, self.SYNCDIRNAME)
        # SYNCSTR is defined in _Sync
        # self.SYNCFILE = os.path.join(self.config_dir, "sync.txt")

        ## Check for CONFIGFILE
        # "checkout", "diff", "init", "log", "show", "status"
        # work fine without config.
        if args.command_name in ["pull", "push", "remote"]:
            try:
                with open(self.CONFIGFILE, "r") as conf_file:
                    config_text = conf_file.read()
            except OSError as err:
                if (args.command_name == "remote"
                        and args.remote_command == "add"
                        and not os.path.exists(self.CONFIGFILE)):
                    config_text = ""
                else:
                    # we are in an existing repository,
                    # because .ys exists.
                    _print_error(
                        "fatal: could not read {} configuration at {}.".
                        format(self.NAME, self.CONFIGFILE)
                        + "\n Check your permissions or restore missing files "
                        "with '{} init'".
                        format(self.NAME)
                    )
                    raise err
            try:
                config, configdict = self._read_config(config_text)
            except configparser.Error as err:
                err_descr = type(err).__name__ + ":\n " + str(err)
                _print_error(
                    "{} configuration error in {}:\n ".
                    format(self.NAME, self.CONFIGFILE)
                    + err_descr
                )
                raise YSConfigurationError(err, err_descr)
            self._configdict = configdict
            # Don't economize on memory here, but enhance our object
            # (better to store than to re-read).
            # if args.command_name == "remote":
            #     # config is not needed for any other command
            self._config = config

        ####################################
        ## Initialize optional parameters ##
        ####################################

        #########################
        ## Initialize commands ##
        #########################

        # there is no easy way to set a default command
        # for a subparser, https://stackoverflow.com/a/46964652/952234
        if args.command_name == "commit":
            self._func = functools.partial(
                self._commit,
                limit=args.limit, message=args.message
            )
        elif args.command_name == "clone":
            if root_dir:
                # cloning to
                if args.force:
                    err_msg = ("yarsync: error: --force can be used "
                               "only when cloning from")
                    print(err_msg)
                    raise YSArgumentError("force", err_msg)
                self._func = functools.partial(
                    self._clone_to, remote=args.name,
                    parent_path=args.path
                )
            else:
                # cloning from
                self._func = functools.partial(
                    self._clone_from, name=args.name, path=args.path,
                    force=args.force
                )
        elif args.command_name == "init":
            # https://stackoverflow.com/a/41070441/952234
            # disable merge for release 0.2.
            self._func = functools.partial(self._init, args.reponame,
                                           merge=False)
            # self._func = functools.partial(self._init, args.reponame,
            #                                merge=args.merge)
            # this also works, but lambdas can't be pickled
            # (even though we don't need that)
            # self._func = lambda: self._init(self._args.reponame)
        elif args.command_name in ["pull", "push"]:
            if args.command_name == "pull":
                new = args.new
                backup_dir = args.backup_dir
                backup = args.backup or backup_dir
                remote = args.source
            else:
                new = False
                backup = False
                backup_dir = ""
                remote = args.destination
            self._func = functools.partial(
                # common options
                self._pull_push,
                args.command_name, remote,
                dry_run=args.dry_run, force=args.force,
                # overwrite=args.overwrite,
                # pull options
                new=new, backup=backup, backup_dir=backup_dir
            )
        elif args.command_name == "remote" and args.remote_command is None:
            self._func = self._remote_show
        else:
            self._func = args.func

        if args.command_name == "pull":
            args._remote = args.source
        elif args.command_name == "push":
            args._remote = args.destination

        self._default_print_level = 2
        self.print_level = (self._default_print_level
                            - args.quiet + args.verbose)

        self._args = args

    def _clone_from(self, name, path, force=False):
        """Clone the repository from *path*.

        The new repository will be called *name*
        and have the origin as a remote.
        """
        orig_path = path
        path = _substitute_env(path).getvalue()
        if path.endswith('/'):
            path = path[:-1]
        repo_dir_name = os.path.basename(path)

        # get remote configuration. Note the trailing slash
        remote_config_path = path + "/.ys/"
        # remote_config_path = os.path.join(path, self.YSDIR)
        try:
            remote_config = self._get_remote_config(remote_config_path)
        except OSError:
            _print_error("No yarsync repository found at {}".format(path))
            return COMMAND_ERROR
        except YSConfigurationError as err:
            # the error is not printed in main,
            # because here we can customize it better
            _print_error("Could not get remote configuration. " + err.msg)
            return CONFIG_ERROR

        if not force and "rsync-filter" in remote_config._file_list:
            _print_error(
                "Remote configuration contains rsync-filter.\n "
                "Initialize a new repository and copy the filter manually,\n "
                "then add the remote and pull data from there."
            )
            return CONFIG_ERROR

        remote_name = remote_config.repo_name
        if remote_name == name:
            # todo: it should be checked among all synced repos
            _print_error(
                "Name '{}' is already used by the remote. ".format(name)
                + "Aborting.\n Each replica name must be unique."
            )
            return COMMAND_ERROR
        self._remote_config = remote_config

        # create a local directory
        if not os.path.exists(repo_dir_name):
            self._print_command("mkdir {}".format(repo_dir_name))
            os.mkdir(repo_dir_name)
        else:
            _print_error(
                "Directory '{}' exists. Aborting.\n ".format(repo_dir_name)
                + "Are you cloning from the parent directory of '{}'?"
                .format(orig_path)
            )
            return COMMAND_ERROR
        os.chdir(repo_dir_name)

        # initialize a new object
        # to set the working and configuration directories.
        # todo: create an initialization which would use
        # verbosity from this object (self).
        ys = YARsync(["yarsync", "-qq", "init", name])
        # todo: rename reponame to name.
        ys._init(reponame=name)
        # add remote *remote_name* and pull data from there
        ys._remote_add(remote_name, orig_path)
        # todo: fix configuration update during remote_add.
        ys_pull = YARsync(["yarsync", "-qq", "pull", remote_name])
        # possible exceptions will raise from _pull_push,
        # we don't do cleaning up in the local repository.
        # If the remote is not a repository, we shall know about it here
        returncode = ys_pull._pull_push("pull", remote_name)

        # configuration existence is checked in _get_remote_config
        # if returncode == CONFIG_ERROR:
        #     _print_error("no yarsync repository found at {}".format(path))
        #     self._print("\nremoving remote {}".format(remote_name))
        #     ys._remote_rm(remote_name)
        #     return COMMAND_ERROR

        if returncode:
            _print_error("An error occurred while pulling data from '{}'."
                         .format(remote_name))
        else:
            self._print("\ncloned from '{}'.".format(remote_name))
        return returncode

    def _clone_to(self, remote, parent_path):
        """Clone this repository into *parent_path*.

        The full path to the new repository is *parent_path*
        joined with the directory name of the local repository.
        *remote* is the name of the cloned repository.
        It is added to the local configuration
        and into synchronization data.
        """
        # 0. Check that remote directory doesn't exist
        repo_dir_name = os.path.basename(self.root_dir)
        eparent_path = _substitute_env(parent_path).getvalue()
        path = os.path.join(parent_path, repo_dir_name)
        try:
            # parent_path must exist and be readable
            # print rsync errors only when verbose
            rsync_pl = self._default_print_level + 2
            remote_files = self._get_remote_files(
                eparent_path, print_level=rsync_pl
            )
        except OSError:
            # hypothetically, we need only write access,
            # but it would be safer to check
            # the existence of the new path
            _print_error(
                "Parent folder of the clone could not be read. "
                "Aborting.\n Does the path '{}' exist?".format(eparent_path)
            )
            return COMMAND_ERROR
        if repo_dir_name in remote_files:
            _print_error(
                "Repository folder already exists at {} . Aborting."
                .format(os.path.join(parent_path, repo_dir_name))
            )
            return COMMAND_ERROR

        # 1. Add remote
        # We don't check that remote is not in sync,
        # because we might want to clone same repo anew (hypothetically).
        # Important to call it before using self._configdict
        returncode = self._remote_add(remote, path)
        if returncode:
            # all errors were printed in _remote_add
            return returncode
        # add remote to the parsed configdict for push.
        # We could have made it an argument for _pull_push,
        # but it may use config for finer transfer options.
        self._configdict[remote] = {}
        self._configdict[remote]["destpath"] = _substitute_env(path).getvalue()

        # cache repo name before creating another file for that
        try:
            self._get_repo_name_local()
        except YSConfigurationError:
            self._remote_rm(remote)
            return CONFIG_ERROR
        # temporarily create a repo file to transfer it
        remote_repo_file = self._write_repo_name(remote, verbose=False)
        clone_to_file = self.CLONETOFILE.format(remote)
        self._print("# create a temporary file {}".format(clone_to_file),
                    level=self._default_print_level)
        with open(clone_to_file, "x"):
            pass
        # todo: maybe optionally transfer config
        # (without the new remote) as well
        include_configs = [os.path.basename(remote_repo_file)]

        def remote_rm():
            self._remote_rm(remote)
            try:
                sync = self._sync
            except AttributeError:
                # sync was not updated by push,
                # can be when we have uncommitted changes, etc.
                pass
            else:
                sync.remove_repo(remote)
                # todo: maybe should be a method of _Sync
                self._write_sync(sync)
            _print_error("\ncould not push data to {}".format(path))

        # 2. push to
        try:
            # uncommitted changes, etc, will be checked here
            # todo: maybe clone and other arguments
            # should be available for testing from command line
            returncode = self._pull_push(
                "push", remote,
                clone=True,
                include_configs=include_configs,
                # force=True,
            )
        except BaseException as e:
            # for idempotence in case of errors
            remote_rm()
            raise e
        finally:
            os.remove(remote_repo_file)
            os.remove(clone_to_file)
        if returncode:
            remote_rm()
            return returncode

        self._print("\ncloned to '{}'.".format(remote))
        return returncode

    def _checkout(self, commit=None):
        """Checkout a commit.

        Warning: all changes in the working directory
        will be overwritten!
        """
        # todo: do we allow a default (most recent) commit?
        # also think about ^, ^^, etc.
        # However, see no real usage for them.
        if commit is None:
            commit = int(self._args.commit)
        # todo: improve verbosity handling
        verbose = True

        if commit not in self._get_local_commits():
            raise ValueError("commit {} not found".format(commit))

        # copied from _status()
        commit_dir = os.path.join(self.COMMITDIR, str(commit))
        command_begin = [
            "rsync", "-au",
            # completely meaningless: "--no-inc-recursive"
            "--link-dest=.ys/commits/{}".format(commit),
        ]
        if self._args.dry_run:
            command_begin += ["-n"]
        command_begin.extend(["--delete", "-i", "--exclude=/.ys"])

        filter_command = self._get_filter(include_commits=False)
        command = command_begin + filter_command

        # outbuf option added in Rsync 3.1.0 (28 Sep 2013)
        # https://download.samba.org/pub/rsync/NEWS#ENHANCEMENTS-3.1.0
        # from https://stackoverflow.com/a/35775429
        command.append('--outbuf=L')

        command += [commit_dir + '/', self.root_dir]

        if verbose:
            self._print_command(command)
            sp = subprocess.run(command)
        else:
            sp = subprocess.run(command, stdout=subprocess.PIPE)
        # we don't check for error code here,
        # because if checkout was wrong, we can't be sure
        # in the resulting state.

        if commit == self._get_last_commit():
            # remove HEADFILE
            self._update_head()
        else:
            # write HEADFILE
            with open(self.HEADFILE, "w") as head_file:
                print(commit, file=head_file)

        return sp.returncode

    def _commit(self, limit=None, message=""):
        """Commit the working directory and create a log.

        Commit name is based on UNIX time.
        If there are more commits than *limit*,
        older commits and logs will be removed.
        """
        try:
            reponame = self._get_repo_name_local()
        except YSConfigurationError:
            return CONFIG_ERROR
        username = getpass.getuser()
        time_str = time.strftime(self.DATEFMT, time.localtime())

        log_str = "When: {date}\nWhere: {user}@{repo}".format(
            date=time_str, user=username, repo=reponame
        )
        if message:
            message += "\n\n"
        if os.path.exists(self.MERGEFILE):
            # copied from _status
            with open(self.MERGEFILE, "r") as fil:
                merge_str = fil.readlines()[0].strip()
            merges = merge_str.split(',')
            message += "Merge {} and {} (common commit {})\n"\
                       .format(*merges)
        if limit is not None:
            message += "Setting commit limit to {}.\n".format(limit)
        message += log_str

        if not os.path.exists(self.COMMITDIR):
            self._print_command("mkdir {}".format(self.COMMITDIR))
            os.mkdir(self.COMMITDIR)

        commit_name = str(int(time.time()))
        commit_dir = os.path.join(self.COMMITDIR, commit_name)
        commit_dir_tmp = commit_dir + "_tmp"

        # Raise if this commit exists.
        # We don't want rsync to write twice to one commit,
        # even though it's hard to imagine how this could be possible
        # (probably broken clock?)
        # todo: improve concurrency.
        if os.path.exists(commit_dir):
            raise RuntimeError("commit {} exists".format(commit_dir))
        if os.path.exists(commit_dir_tmp):
            raise RuntimeError(
                "temporary commit {} exists".format(commit_dir_tmp)
            )

        # exclude .ys, otherwise an empty .ys/ will appear in the commit
        command = ["rsync", "-a", "--link-dest=../../..", "--exclude=/.ys"]
        filter_list = self._get_filter(include_commits=False)
        command.extend(filter_list)

        # the trailing slash is very important for rsync;
        # on Windows the separator is the same for rsync.
        # https://stackoverflow.com/a/59987187/952234
        # However, this may or may not work in cygwin
        # https://stackoverflow.com/a/18797771/952234
        root_dir = self.root_dir + '/'
        command.extend([root_dir, commit_dir_tmp])

        self._print_command(command)
        if self.print_level >= 3:
            # with run there will be problems during testing
            completed_process = subprocess.Popen(command)
        else:
            completed_process = subprocess.Popen(
                command, stdout=subprocess.DEVNULL
            )
        completed_process.communicate()
        returncode = completed_process.returncode
        if returncode:
            # if the run was not verbose enough, we won't see stdout.
            # Make a more verbose commit then.
            _print_error("an error occurred during hard linking, "
                         "rsync returned {}".format(returncode))
            return returncode

        # commit is done
        self._print_command("mv {} {}".format(commit_dir_tmp, commit_dir),
                            level=3)
        os.rename(commit_dir_tmp, commit_dir)

        ## log ##
        if not os.path.exists(self.LOGDIR):
            self._print_command("mkdir {}".format(self.LOGDIR))
            os.mkdir(self.LOGDIR)
        commit_log_name = os.path.join(self.LOGDIR, commit_name + ".txt")
        # write log file
        with open(commit_log_name, "w") as commit_file:
            print(message, file=commit_file)
        # print to stdout
        self._print(
            "commit {} created\n\n".format(commit_name),
            message, sep=""
        )

        try:
            # merge is done, if that was active
            os.remove(self.MERGEFILE)
        except FileNotFoundError:
            pass

        # if we were not at HEAD, move that now
        self._update_head()

        if limit is None:
            repo_commit_limit = self._get_commit_limit()
            if repo_commit_limit is None:
                return 0
            limit = repo_commit_limit
            cl_from_file = True
        else:
            cl_from_file = False

        ## limit commits
        commits = sorted(self._get_local_commits())
        ncommits = len(commits)
        if ncommits > limit:
            delete_commits = commits[:ncommits - limit]
            for comm in delete_commits:
                comm_path = os.path.join(self.COMMITDIR, str(comm))
                log_path = os.path.join(self.LOGDIR, str(comm) + ".txt")
                self._print("removing commit {}".format(comm))
                shutil.rmtree(comm_path)
                try:
                    os.remove(log_path)
                except FileNotFoundError:
                    pass
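            # Commit pruning sketch (hypothetical commit names):
            # with commits [1000, 2000, 3000] and limit = 2,
            # delete_commits == [1000], so only the oldest commit
            # (and its log, if present) is removed;
            # 2000 and 3000 are kept.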
            self._print("removed older commits with logs")

        # make commit limit persistent
        if not cl_from_file:
            with open(self.COMMITLIMITFILE, "w") as fil:
                fil.write(str(limit))

        return 0

    def _diff(self, commit1=None, commit2=None, verbose=True):
        # arguments are positional only
        """Print the difference between *commit1* and *commit2*
        (from the old to the new one).
        """
        if commit1 is None:
            commit1 = int(self._args.commit)
        if commit2 is None:
            commit2 = self._args.other_commit
            if commit2 is None:
                commit2 = self._get_last_commit()
            else:
                commit2 = int(commit2)

        comm1 = min(commit1, commit2)
        comm2 = max(commit1, commit2)

        comm1_dir = os.path.join(self.COMMITDIR, str(comm1))
        comm2_dir = os.path.join(self.COMMITDIR, str(comm2))

        if not os.path.exists(comm1_dir):
            raise ValueError("commit {} does not exist".format(comm1))
        if not os.path.exists(comm2_dir):
            raise ValueError("commit {} does not exist".format(comm2))

        command = [
            "rsync", "-aun",
            # useless now, see comment in _status()
            # "--no-inc-recursive",
            "--delete", "-i",
        ]
        # outbuf option added in Rsync 3.1.0 (28 Sep 2013)
        # https://download.samba.org/pub/rsync/NEWS#ENHANCEMENTS-3.1.0
        # from https://stackoverflow.com/a/35775429
        command.append('--outbuf=L')

        # what changes should be applied for comm1 to become comm2
        # / is extremely important!
        command += [comm2_dir + '/', comm1_dir]

        if verbose:
            self._print_command(command)

        sp = subprocess.Popen(command, stdout=subprocess.PIPE)
        for line in iter(sp.stdout.readline, b''):
            print(line.decode("utf-8"), end='')
        return sp.returncode

    def _get_commit_limit(self):
        try:
            with open(self.COMMITLIMITFILE) as fil:
                cl_content = fil.readline()
        except FileNotFoundError:
            return None
        try:
            commit_limit = _check_positive(cl_content)
        except argparse.ArgumentTypeError:
            raise YSConfigurationError(
                msg="commit limit must be a natural number. "
                    "{} contains {}".format(self.COMMITLIMITFILE, cl_content)
            )
        return commit_limit

    def _get_dest_path(self, dest=None):
        """Return the destination path for *dest*
        with a trailing slash appended.

        If *dest* or its destination path is missing
        from the configuration, `KeyError` is raised.
        """
        config = self._configdict
        # if dest:
        try:
            dest_section = config[dest]
        except KeyError:
            raise KeyError(
                "no destination '{}' found in the configuration {}. ".
                format(dest, self.CONFIGFILE)
            )
        # todo:
        # else:
        #     # find a section with a key "upstream".
        #     # It could be also "upstream-push" or "upstream-pull",
        #     # but this looks too remote for now.
        #     try:
        #         dest = config["default"]["name"]
        #     except KeyError:
        #         raise KeyError(
        #             'no default destination found in config. '
        #         )

        # destpath must be present (checked during _read_config).
        destpath = dest_section["destpath"]
        if not destpath.endswith('/'):
            destpath += '/'
        return destpath

    def _get_filter(self, path="", include_commits=True, include_configs=()):
        """Make rsync filters to be used during synchronization.

        If *path* is non-empty, filters are relative to that path.
        """
        # todo: path is probably not needed and not fully implemented
        # rename to _make_filter.
        # rename include_commits to include_what?
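        # A sketch of the resulting arguments (paths are illustrative):
        #   ['--filter=merge /repo/.ys/rsync-filter',  # if the file exists
        #    '--include=.ys/commits', '--include=.ys/logs',
        #    '--include=.ys/sync', '--exclude=/.ys/*']
        # The includes keep yarsync metadata in transfers,
        # while the final exclude drops everything else under .ys.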
        if path:
            rsync_filter = os.path.join(path, self.YSDIR,
                                        self.RSYNCFILTERNAME)
        else:
            rsync_filter = self.RSYNCFILTER
        if os.path.exists(rsync_filter):
            # for merge filter rsync requires a full path,
            # while for include/exclude only relative ones
            filter_ = ["--filter=merge {}".format(rsync_filter)]
        else:
            filter_ = []

        include_filters = []
        if include_commits:
            includes = [
                "/".join([self.YSDIR, self.COMMITDIRNAME]),
                "/".join([self.YSDIR, self.LOGDIRNAME]),
                "/".join([self.YSDIR, self.SYNCDIRNAME]),
                # "/.ys/logs"
            ]
            include_filters = ["--include={}".format(inc)
                               for inc in includes]
        # since we don't have spaces in the command,
        # single ticks are not necessary
        for config in include_configs:
            # reponame for clone
            inc_config = "/".join([self.YSDIR, config])
            include_filters.append("--include={}".format(inc_config))
        filter_.extend(include_filters)
        # exclude could go before or after include,
        # but the first matching rule is applied.
        # It's important to place /* after .ys,
        # because it means files exactly one level below .ys
        filter_ += ["--exclude=/.ys/*"]
        return filter_

    def _get_head_commit(self):
        try:
            with open(self.HEADFILE, "r") as head:
                # strip trailing newline
                head_commit = head.readlines()[0].strip()
        except OSError:
            # no HEADFILE means HEAD is the most recent commit
            return None
        return int(head_commit)

    def _get_last_commit(self, commits=None):
        # todo: cache the last commit (or all commits)
        # but: pull can update that!
        if commits is None:
            commits = self._get_local_commits()
        if not commits:
            return None
        return max(commits)

    def _get_local_commits(self):
        """Return local commits as an iterable of integers."""
        # todo: cache results. But note that it is used in pull.
        try:
            commit_candidates = os.listdir(self.COMMITDIR)
        except OSError:
            # no commits exist
            # todo: do we print about that here?
            commit_candidates = []
        return list(map(int, filter(_is_commit, commit_candidates)))

    def _get_local_sync(self, syncdata=None, verbose=True):
        """Get local synchronization information."""
        if syncdata is None:
            try:
                syncdata = os.listdir(self.SYNCDIR)
            except FileNotFoundError:
                # subtype of OSError
                syncdata = []
                # this is not an error.
                # verbose is False for automatic usage.
                if verbose:
                    self._print("No synchronization directory found.")
        # parse synchronization data
        sync = _Sync(syncdata)
        if not sync and verbose:
            self._print("No synchronization information found.")
        return sync

    def _get_remote_config(self, config_path, print_level=3):
        """Return remote configuration as _Config."""
        # try cached value
        if self._remote_config:
            return self._remote_config
        try:
            remote_files = self._get_remote_files(
                config_path, with_commits=True, print_level=print_level
            )
        except OSError as err:
            # we don't push into a non-existing repository
            raise err
            # # detailed error messages are already printed by rsync
            # raise OSError(
            #     "error while listing remote commits"
            # )
            # return {"commits": [], "sync": _Sync([])}
        # can raise YSConfigurationError
        remote_config = _Config(remote_files)
        # this is a getter. We set self._remote_config
        # elsewhere if needed.
        return remote_config

    def _get_remote_files(self, path, with_commits=False, print_level=3):
        """Return files at the remote *path* as a dictionary.

        Top-level directories are mapped to lists of contained paths;
        plain files are mapped to ``None``.
        Path can be one file (why though).
        The result does not contain '.' and '..'.
        Remote can be local.
        """
        # we want to list directory contents, not just their names
        if not path.endswith('/'):
            path += '/'
        command = ["rsync", "--list-only"]
        if with_commits:
            # list commits, but not their contents
            command.extend(["-r", "--exclude=/*/*/*", "--exclude=logs/"])
        command.append(path)
        # no idea what from_path was in that case.
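        # The parsing below assumes the usual `rsync --list-only` format
        # (permissions, size, date, time, name), e.g. (illustrative):
        #   drwxr-xr-x          4,096 2023/01/14 16:35:56 commits
        #   -rw-r--r--             10 2023/01/14 16:35:56 repo_host.txt
        # A leading 'd' (ASCII 100) marks a directory; the name is taken
        # as the last whitespace-separated field of each line.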
        # command = "rsync -nr --info=NAME --include=/ --exclude=/*/*".split() \
        #           + [from_path, to_path]
        self._print_command(" ".join(command), level=print_level)

        # at print level 3 we print errors, but not the commands
        if print_level <= self.print_level + 1:
            stderr = None  # all errors printed
        else:
            stderr = subprocess.DEVNULL

        sp = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=stderr)
        # todo: can we iterate stdout before wait is finished?
        # It is not critical, because we list not the complete hierarchy
        sp.wait()
        returncode = sp.returncode
        if returncode:
            raise OSError(
                "error during listing remote files: rsync returned {}"
                .format(returncode)
            )

        files = {}
        for line in iter(sp.stdout.readline, b''):
            # print(line, line[0])
            # probably files with a space won't work here.
            fil = line.split()[-1]
            if fil in [b'.', b'..']:
                continue
            # make them strings for easier use
            # and coherent with os.listdir
            path = fil.decode("utf-8")
            # example: commits/1579013756
            parts = path.split('/')  # or pathlib.PurePath(path).parts
            if len(parts) == 1:
                if line[0] == 100:  # b'd'
                    # a directory
                    dir_ = parts[0]
                    if dir_ not in files:
                        files[dir_] = []
                else:
                    # a file in .ys
                    files[path] = None
            else:
                dir_ = parts[0]
                subpath = "/".join(parts[1:])
                if dir_ in files:
                    files[dir_].append(subpath)
                else:
                    files[dir_] = [subpath]
        return files

    def _get_repo_name_local(self):
        # cache this value, because in clone we create a temporary one
        # and wouldn't be able to have two files simultaneously
        if hasattr(self, "_reponame"):
            return self._reponame
        reponame = _get_repo_name_if_exists(config_dir=self.config_dir)
        if reponame is None:
            err_msg = ("Could not find repository name. "
                       "Provide one with init.")
            _print_error(err_msg)
            raise YSConfigurationError(msg=err_msg)
        self._reponame = reponame
        return reponame

    def _init(self, reponame="", merge=False):
        """Initialize default configuration.
        Create configuration folder, configuration and repository files.
        If a configuration file already exists, it is not changed.
        This operation is safe and idempotent (can be repeated safely).

        *reponame* will be written to self.REPOFILE
        and used during commits.

        If *merge* is ``True``, the repository comprises
        several existing ones. This can be used to rearrange them
        without re-sending present remote files.
        """
        """
        How to merge:
        1) Prepare the merge.
          a) Synchronize all needed repositories
             (bring them to the same state).
             This is not needed (see comment about commits),
             but will make things easier.
          b) Check that you have not too many files (hard links),
             because they will triple during the merge.
             Consider removing some old commits (and sync again).
             But don't be overly cautious: for hundreds of thousands
             of files this worked fine for the author.
          c') Move non-merging repositories out of the directory
              with merging ones. This might be safer,
              but is unnecessary unless you care about more hard links.
          c'') Alternatively, move all merging repositories
               to a new directory. Since you've already synchronized
               them, preserving their remote paths is not needed.
        2) Init and commit.
           [source] yarsync init --merge
           [source] yarsync commit -m "Merging. Initialize."
        3) Check that the repositories and filters are correct.
        4) Create a remote merging repository (--merge is needed,
           otherwise remote status will show files in subdir/.ys/ .
           Probably won't affect actual transfers, because all filters
           will act on the sending side).
           [dest] yarsync init --merge
           Remove new rsync filters on dest, but it is not needed
           (we assume that the destination has no filters!)
        5) Push commits to destination.
           # test, as ever
           [source] yarsync push -n
           [source] yarsync push
        6) Now they are synced. Reorganize data
           (tip: just move gross directories to corresponding
           repositories, finely rearrange them any time later).
           Can remove the merging subrepository.
           It is saved in commits.
           [source] # yarsync status
           [source] yarsync commit -m "Merge done."
           May remove the merging repository from .ys/rsync-filter .
        6') (optional) Make commits in the resulting subrepositories.
           [source/repo] yarsync commit -m "Merged that and that data from ."
           Probably don't do that, or deal with rsync filters
           in their roots.
        7) Push data to its destination.
           [source] # yarsync push -n
           [source] yarsync push
           If you removed the repository locally, but didn't wipe it
           from the filter, rsync will refuse to delete its remote
           configuration, because it is still protected by filter
           rules. You can remove it manually on the destination.
           [dest] # rm -rf
        8) Finish merge. Move all repositories
           to their initial directories.
        8') If you didn't commit during 6'), commit changes
           to local repositories and push them to the destination.
           This should be really quick, because all files
           are already there.
           This could be done after 9), but is a bit safer before that.
        9) Remove the merging repository.
           Check that the current path is correct before that!
           [local] rm -rf .ys
           [remote] rm -rf .ys
        """
        init_repo_str = "Initialize configuration"
        if reponame:
            init_repo_str += " for '{}'".format(reponame)
        self._print(init_repo_str, level=2)

        # create config_dir
        ysdir = self.config_dir
        if not os.path.exists(ysdir):
            self._print_command("mkdir {}".format(ysdir))
            # self._print_command("mkdir -m {:o} {}".
            #                     format(self.DIRMODE, ysdir))
            # can raise "[Errno 13] Permission denied: '.ys'"
            os.mkdir(ysdir)
            # if every configuration file existed,
            # new_config will be False
            new_config = True
        else:
            self._print("{} already exists, skip".format(ysdir),
                        level=self._default_print_level)
            new_config = False

        # create self.CONFIGFILE
        if not os.path.exists(self.CONFIGFILE):
            self._print("# create configuration file {}"
                        .format(self.CONFIGFILE),
                        level=self._default_print_level)
            with open(self.CONFIGFILE, "w") as fil:
                print(CONFIG_EXAMPLE, end="", file=fil)
            new_config = True
        else:
            self._print("{} already exists, skip".format(self.CONFIGFILE),
                        level=self._default_print_level)

        # create repofile
        cur_reponame = _get_repo_name_if_exists(config_dir=self.config_dir)
        if not cur_reponame:
            if not reponame:
                hostname = socket.gethostname()
                reponame = input("Enter repository name [{}]:"
                                 .format(hostname))
                if not reponame:
                    # default, no entry
                    reponame = hostname
            self._write_repo_name(reponame)
            new_config = True
        else:
            if reponame and cur_reponame != reponame:
                _print_error(
                    "a different repository name {} already exists. Aborting"
                    .format(cur_reponame)
                )
                return COMMAND_ERROR
            self._print("{} already exists, skip".format(cur_reponame),
                        level=self._default_print_level)

        # completely untested
        if merge:
            rsync_filter = "rsync-filter"
            dirs = os.listdir('.')
            # it is recommended to merge existing repositories
            # (not just any directories),
            # but we don't check it here.
            filter_strs = ["# created by 'yarsync init --merge'"]
            for dir_ in dirs:
                if not os.path.exists(os.path.join(dir_, ".ys")):
                    # not yarsync repositories
                    continue
                if not os.path.isdir(dir_):
                    # simple files
                    continue
                if dir_ == ysdir:
                    # we are already in a merging repository,
                    # and init is idempotent.
                    continue
                # transfer commits and logs.
                # This allows syncing the resulting
                # repositories simultaneously.
                # If one wants to have fewer hard links,
                # they should remove these include lines manually.
                filter_strs.append("+ /" + dir_ + "/.ys/commits")
                filter_strs.append("+ /" + dir_ + "/.ys/logs")
                ys_filter = os.path.join(dir_, ".ys", rsync_filter)
                if os.path.exists(ys_filter):
                    # copy rsync filters, so that they have effect
                    # on the repository (not only inside .ys directory).
                    filter_copy = os.path.join(dir_, rsync_filter)
                    # check that we don't destroy existing files.
                    if os.path.exists(filter_copy):
                        if (os.stat(filter_copy).st_ino
                                != os.stat(ys_filter).st_ino):
                            # st_ino is platform dependent,
                            # but since we use commits
                            # and we are on Linux,
                            # it should always work (for Windows too).
                            # https://docs.python.org/3/library/os.html#os.stat_result
                            _print_error(
                                filter_copy + " exists. "
                                "Can't link existing filter {}/.ys/rsync-filter."
                                "\n Remove or rename that file.".format(dir_)
                            )
                            raise YSCommandError()
                    else:
                        os.link(ys_filter, filter_copy)
                    # Filters for different repositories
                    # must be independent,
                    # therefore they are "per-directory"
                    # (single-instance ones
                    # are simply incorporated into the filter).
                    # Possible problems:
                    # - stray rsync-filters (those outside .ys,
                    #   that would normally have no effect).
                    #   -- seems a larger path works fine.
                    # - excluding/including more than needed.
                    #   -- surprisingly, various/repos
                    #   was correctly created.
                    filter_strs.append(": " + dir_ + "/.ys/rsync-filter")
                    filter_strs.append("- " + filter_copy)
                    # don't use a slash before dir_,
                    # otherwise it will search in upper directories.
                # do not transfer the repository configuration.
                # /* is very important at the end,
                # because with /.ys it will not consider anything there
                # (includes discarded).
                filter_strs.append("- /" + dir_ + "/.ys/*")
            # --merge overwrites this file every init.
# if os.path.exists(self.RSYNCFILTER): with open(self.RSYNCFILTER, 'w') as fil: for str_ in filter_strs: print(str_, file=fil) new_config = True self._print("# Created configuration file {}".format(self.RSYNCFILTER), level=self._default_print_level) ysdir_fp = os.path.realpath(ysdir) if new_config: self._print("\nInitialized yarsync configuration in {} " .format(ysdir_fp)) else: self._print("\nConfiguration in {} already initialized." .format(ysdir_fp)) return 0 def _make_commit_list(self, commits=None, logs=None): """Make a list of *(commit, commit_log)* for all logs and commits. *commits* and *logs* are sorted lists of integers. If a log is missing for a given commit, or a commit is missing for a log, the result contains ``None``. """ # commits and logs in the interface # are only for testing purposes def get_sorted_logs_int(files, commits=None): # discard '.log' extension log_names = (fil[:-4] for fil in files) sorted_logs = sorted(map(int, filter(_is_commit, log_names))) if commits is None: return sorted_logs else: # if commits are set explicitly, # return logs only for those commits return [log for log in sorted_logs if log in commits] if logs is None: try: log_files = os.listdir(self.LOGDIR) except OSError: # no log directory exists log_files = [] logs = get_sorted_logs_int(log_files, commits) if commits is None: commits = sorted(self._get_local_commits()) else: commits = sorted(commits) # note that we don't check whether these commits # actually exist. This function logic doesn't require that. # todo: allow commits in the defined order. # that would require first yielding all commits, # then all logs without commits. Looks good. # And much simpler. But will that be a good log?.. 
        if not commits and not logs:
            return []

        results = []
        commit_ind = 0
        commits_len = len(commits)
        log_ind = 0
        logs_len = len(logs)
        commit = None
        log = None

        while True:
            logs_finished = (log_ind > logs_len - 1)
            commits_finished = (commit_ind > commits_len - 1)

            if not commits_finished:
                commit = commits[commit_ind]
            if not logs_finished:
                log = logs[log_ind]

            if commits_finished and logs_finished:
                break
            elif logs_finished:
                results.append((commit, None))
                commit_ind += 1
                continue
            elif commits_finished:
                results.append((None, log))
                log_ind += 1
                continue

            # print(commit_ind, log_ind)
            # both commits and logs are present
            if commit == log:
                results.append((commit, log))
                commit_ind += 1
                log_ind += 1
            elif commit < log:
                results.append((commit, None))
                commit_ind += 1
            else:
                results.append((None, log))
                log_ind += 1
        return results

    def _log(self):
        """Print commits and log information.

        If only a commit or only a log is present, it is listed as well.

        By default the most recent commits are printed first.
        Set *reverse* to ``True`` to print the most recent commits last.
""" reverse = self._args.reverse max_count = self._args.max_count cl_list = self._make_commit_list() if not reverse: commit_log_list = list(reversed(cl_list)) else: commit_log_list = cl_list if max_count != -1: # otherwise the last element is excluded commit_log_list = commit_log_list[:max_count] sync = self._get_local_sync(verbose=True) head_commit = self._get_head_commit() try: local_repo = self._get_repo_name_local() except YSConfigurationError: return CONFIG_ERROR def print_logs(commit_log_list): for ind, (commit, log) in enumerate(commit_log_list): if ind: print() self._print_log( commit, log, local_repo=local_repo, sync=sync, head_commit=head_commit ) print_logs(commit_log_list) if not commit_log_list: self._print("No commits found") return 0 def _print(self, *args, level=None, **kwargs): """Print output messages.""" if level is None: # when we don't supply a print level, # these calls are usually important level = self._default_print_level - 1 if level > self.print_level: return if level > self._default_print_level: print("# ", end='') print(*args, **kwargs) def _print_command(self, command, level=None): """Print called commands.""" # A separate function to semantically distinguish that # from _print in code. # However, _print is used internally - to handle output levels. 
        def command_str(command):
            for comm in command:
                if ' ' in comm:
                    # can be present for
                    # --filter='merge test_dir_filter/.ys/rsync-filter'
                    # Alternatively, one can put '' to the right of '='
                    yield "'{}'".format(comm)
                else:
                    yield comm

        if isinstance(command, str):
            self._print(command, level=level)
        else:
            # list
            self._print(" ".join(command_str(command)), level=level)

    def _print_log(self, commit, log, local_repo, sync, head_commit=None):
        if commit is None:
            commit_str = "commit {} is missing".format(log)
            commit = log
        else:
            commit_str = "commit " + str(commit)
        if commit == head_commit:
            commit_str += " (HEAD)"
        if commit in sync.by_repos.values():
            other_repos = sync.get_synced_repos_for(
                commit, exclude_repo=local_repo
            )
            remote_str = ", ".join(other_repos)
            commit_str += " <-> {}".format(remote_str)
        if log is None:
            log_str = "Log is missing"
            # time.time is timezone independent.
            # Therefore localtime is the local time
            # corresponding to that universal time.
            # Commit could be made in any time zone.
            commit_time_str = time.strftime(self.DATEFMT,
                                            time.localtime(commit))
            log_str += "\nWhen: {}".format(commit_time_str) + '\n'
        else:
            # close the file explicitly instead of relying
            # on garbage collection
            with open(os.path.join(self.LOGDIR,
                                   str(log) + ".txt")) as log_file:
                # read returns a redundant newline
                log_str = log_file.read()
        # print("log_str: '{}'".format(log_str))
        # hard to imagine a "quiet log", but still.
        self._print(commit_str, log_str, sep='\n', end='')
        # print(commit_str, log_str, sep='\n', end='')

    def _print_version(self):
        print(self.NAME, "version", __version__)
        # todo: print rsync version and whether it supports hard links

    def _pull_push(
            self, command_name, remote, dry_run=False, force=False,
            new=False, overwrite=False, clone=False, include_configs=(),
            backup=False, backup_dir=""
    ):
        """Push/pull commits to/from destination or source.

        By default, several checks are made to prevent corruption:

        - source has no uncommitted changes,
        - source does not have a detached HEAD,
        - source is not in a merging state,
        - destination has no commits missing on source.
        Note that the destination might have uncommitted changes:
        check that with *-n* (*--dry-run*) first!

        *backup*, *backup_dir* and *new* only apply to pull.
        *overwrite* is temporarily disabled until rsync issue 357 is fixed.
        """
        if self._get_head_commit() is not None:
            # it could be safe to push a repo with a detached HEAD,
            # but that would be messy.
            # OSError is for exceptions
            # that can occur outside the Python system
            raise OSError("local repository has detached HEAD.\n"
                          "*checkout* the most recent commit first.")
        if os.path.exists(self.MERGEFILE):
            raise OSError(
                "local repository has unmerged changes.\n"
                "Manually update the working directory and *commit*."
            )
        # otherwise will be called in _status below
        try:
            local_repo = self._get_repo_name_local()
        except YSConfigurationError:
            # the error is printed
            return CONFIG_ERROR

        if not (new or force):
            returncode, changed = self._status(check_changed=True)
            if changed:
                _print_error(
                    "local repository has uncommitted changes. Exit.\n "
                    "Run '{} status' for more details.".format(self.NAME)
                )
                return COMMAND_ERROR
            if returncode:
                _print_error(
                    "could not check for uncommitted changes, "
                    "rsync returned {}. Exit\n ".format(returncode) +
                    "Run '{} status' for more details.".format(self.NAME)
                )
                return returncode  # COMMAND_ERROR

        try:
            full_destpath = self._get_dest_path(remote)
        except KeyError as err:
            raise err from None

        # --link-dest is not needed, since if a file is new,
        # it won't be in remote commits.
        # -H preserves hard links in one set of files
        # (but see the note in todo.txt).
        command = ["rsync"]
        command.extend(self.RSYNCOPTIONS)
        # Don't print progress by default,
        # because it clutters output for new commits.
        # (it will create an additional line for each file
        # and will require extra work to get rid of it).
if self.print_level >= 3: command.append("-P") if dry_run: command.append("-n") command.append("--no-inc-recursive") if not new: command.append("--delete") if backup: if backup_dir: # create a full hierarchy in the backup_dir command.extend(["--backup-dir", backup_dir]) # --backup is implied during --backup-dir # only since this pull request in 2020 # https://github.com/WayneD/rsync/pull/35 # write new files near originals command.append("--backup") # allow after a fix of https://github.com/WayneD/rsync/issues/357 # elif not overwrite: # command.append("--ignore-existing") # command_str += " --ignore-existing" # we don't include commits (filter them in) # only if we do backups include_commits = not backup filter_ = self._get_filter( include_commits=include_commits, include_configs=include_configs ) command.extend(filter_) root_path = self.root_dir + "/" if command_name == "push": command.extend([root_path, full_destpath]) else: # pull command.extend([full_destpath, root_path]) # old local commits (before possible pull) local_commits = list(self._get_local_commits()) local_sync = self._get_local_sync(verbose=True) # get remote configuration. Note the trailing slash remote_config_dir = os.path.join(full_destpath, ".ys/") if clone and command_name == "push": remote_config = _Config({}, allow_empty=True) else: try: remote_config = self._get_remote_config( remote_config_dir, # don't complain about errors print_level=self._default_print_level+2 ) except OSError: _print_error("remote contains no yarsync repository") return CONFIG_ERROR except YSConfigurationError as err: _print_error("could not read remote configuration. 
                         " + err.msg)
            return CONFIG_ERROR
        remote_commits = remote_config.commits
        remote_sync = remote_config.sync

        # missing_commits depend on the transfer direction
        if command_name == "push":
            source_commits = local_commits
            dest_commits = remote_commits
        else:
            # pull
            source_commits = remote_commits
            dest_commits = local_commits

        # use a set to economize testing membership in a list,
        # https://stackoverflow.com/a/3462202/952234
        # Can move into the comprehension. Not used anywhere else.
        # https://docs.python.org/3/reference/simple_stmts.html#grammar-token-python-grammar-target_list
        _source_commits = set(source_commits)
        missing_commits = [comm for comm in dest_commits
                           if comm not in _source_commits]

        commit_limit = self._get_commit_limit()
        if not (force or new or commit_limit is not None) and missing_commits:
            missing_commits_str = ", ".join(map(str, missing_commits))
            raise OSError(
                "\ndestination has commits missing on source: {}, "
                .format(missing_commits_str) +
                "synchronize these commits first:\n"
                "1) pull missing commits with 'pull --new',\n"
                "2) push if these commits were successfully merged, or\n"
                "2') optionally checkout,\n"
                "3') manually update the working directory "
                "to the desired state, commit and push, or\n"
                "1') pull/push --force the desired state "
                "(removing all commits and logs missing on the destination)."
            )

        if self.print_level >= 3:
            stdout = None
        elif self.print_level == 2:
            stdout = subprocess.PIPE
        else:
            stdout = subprocess.DEVNULL

        # push synchronization information to the remote
        if command_name == "push" and not new and not dry_run and not force:
            # forbid --new sync update,
            # because it messes all sync together.
            # Obsolete local sync will be removed.
local_sync.update(remote_sync.by_repos.items()) last_commit = self._get_last_commit() # todo: get remote name from remote .ys/repo_ # forbid several files with such name local_sync.update([ (local_repo, last_commit), (remote, last_commit) ]) try: self._write_sync(local_sync) except OSError as err: _print_error("could not log synchronization to {}. Abort." .format(self.SYNCDIR)) raise err # object attribute to reverse sync easier self._sync = local_sync # ---------------------------------------------------------- # Run self._print_command(command, level=3) completed_process = subprocess.Popen(command, stdout=stdout) # ---------------------------------------------------------- _ysdir = self.YSDIR if self.print_level == 2: # if we transfer a whole commit, merge all its output into one line. # Print transfers only for the working directory and existing commits. commits_to_transfer = set(source_commits) - set(dest_commits) transferred_commits = set() for line in iter(completed_process.stdout.readline, b''): # iteration copied from https://stackoverflow.com/a/1606870/952234 # Not self.COMMITDIR, because it involves the complete path. COMMITDIR = os.path.join(_ysdir, "commits") if line.startswith(bytes(COMMITDIR, "utf-8")): # commits com_start = len(COMMITDIR) + 1 # or os.sep com_end = line.find(b'/', com_start) com_str = line[com_start:com_end] if not com_str: # ".ys/commits/" print(line.decode("utf-8"), end='') continue cur_commit = int(com_str) if cur_commit in commits_to_transfer: # don't print transfers for complete new commits. transferred_commits.add(cur_commit) else: # print changes for existing commits print(line.decode("utf-8"), end='') else: # working directory print(line.decode("utf-8"), end='') # there can be also lines like # file => .ys/commits/.../file # leave them as they are. 
# # actually, this may be only part of the data # (if there is no space left) print() # "data transferred for commits:") for comm in sorted(transferred_commits): print("commit", comm) # need to wait even if stdout was exhausted completed_process.wait() returncode = completed_process.returncode if returncode: _print_error( "an error occurred, rsync returned {}. Exit". format(returncode) ) return returncode not_all_commits_exist = not local_commits or not remote_commits if not_all_commits_exist: if not local_commits: self._print("local commits missing") if not remote_commits and not clone: self._print("remote commits missing") if new: self._print("run {} without --new to fully synchronize " "repositories".format(command_name)) elif new: last_remote_comm = max(remote_commits) if last_remote_comm in local_commits: # remote commits are within locals # (except some old ones). # Automatic checkout is forbidden, # because it can delete files in the working directory # despite --new . Examples: uncommitted files, # interrupted (incomplete) commits. # self._checkout(max(local_commits)) self._print( "\nRemote commits can be automatically merged.\n" "Check the working directory first with\n" " yarsync status\n" "and commit or check out most recent commit:\n" " yarsync checkout {}".format(max(local_commits)) ) else: # remote commits diverged, need to merge them manually common_commits = set(local_commits)\ .intersection(remote_commits) if common_commits: common_comm = max(common_commits) else: common_comm = "missing" merge_str = "{},{},{}".format(max(local_commits), last_remote_comm, common_comm) # todo: check that it is taken into account in other places! if not dry_run: try: with open(self.MERGEFILE, "w") as fil: print(merge_str, end="", file=fil) except OSError: _print_error( "could not create a merge file {}, ". 
format(self.MERGEFILE) + "create that manually with " + merge_str ) raise OSError from None self._print( "merge {} and {} manually and commit " "(most recent common commit is {})". format(max(local_commits), last_remote_comm, common_comm) ) # update synchronization information locally if command_name == "pull" and not new and not dry_run and not force: # we update "remote" sync, because it will be moved here # and we need to update it with the local information remote_sync.update(local_sync.by_repos.items()) # last_commit is calculated including the pulled ones last_commit = self._get_last_commit() # see todo for push remote_sync.update([ (local_repo, last_commit), (remote, last_commit) ]) try: self._write_sync(remote_sync) except OSError as err: _print_error(err.strerror) _print_error("data transferred, but could not " "log synchronization to " + self.SYNCDIR) if not new and not dry_run: # --new means we've not fully synchronized yet. # either HEAD was correct ("not detached") (for push) # or it was updated (by pull) self._update_head() return 0 def _read_config(self, config_text): # substitute environmental variables (those that are available) # todo: what if an envvar is not present for the current section? subst_lines = _substitute_env(config_text).getvalue() # no value is allowed # for a configuration key "host_from_section_name" config = configparser.ConfigParser(allow_no_value=True) config.read_string(subst_lines) # todo !!: only one of config or configdict must be used! # configdict is config with some evaluations, # like full paths. 
configdict = {} for section in config.sections(): sectiond = dict(config[section]) configdict[section] = sectiond if section == config.default_section: continue try: host = sectiond["host"] except KeyError: if "host_from_section_name" in config[config.default_section]: # sections for remotes are named after their hosts host = section else: host = "" try: path = sectiond["path"] except KeyError as err: err_descr = "a required key 'path' is missing. "\ "Provide the path to the remote '{}'.".\ format(section) _print_error( "{} configuration error in {}:\n ". format(self.NAME, self.CONFIGFILE) + err_descr ) raise YSConfigurationError(err, err_descr) # If host is empty, then this is a local host # or it is already present in the path. # If host is non-empty, that can't be present in the path. sectiond["destpath"] = _mkhostpath(host, path) # print all values: # formatter = lambda s: json.dumps(s, sort_keys=True, indent=4) # print(formatter(configdict)) # config.items() includes the DEFAULT section, which can't be removed. return (config, configdict) def _remote(self): """Manage remotes.""" # Since self._func() is called without arguments, # this is the place for all remote-related operations. if self._args.remote_command == "add": repository = self._args.repository path = self._args.path # options = self._args.options return self._remote_add(repository, path) # return self._remote_add(repository, path, options) elif self._args.remote_command == "rm": return self._remote_rm(self._args.repository) def _remote_add(self, remote, path, options=""): """Add a remote and its path to the config file.""" # from https://docs.python.org/2.7/library/configparser.html#examples if not hasattr(self, "_config"): # config might be missing if we first call 'init' # and then '_remote_add' (as in 'clone'). # In that case it is not necessary to check for all errors. 
with open(self.CONFIGFILE, "r") as conf_file: config_text = conf_file.read() self._config, self._configdict = self._read_config(config_text) config = self._config try: config.add_section(remote) except configparser.DuplicateSectionError: _print_error( "remote {} exists, break.\n Remove {} " "or choose a new remote name.".format(remote, remote) ) return COMMAND_ERROR config.set(remote, "path", path) # todo: options not implemented if options: config.set(remote, "options", options) with open(self.CONFIGFILE, "w") as configfile: config.write(configfile) self._print("Remote '{}' added.".format(remote)) return 0 def _remote_rm(self, remote): """Remove a *remote*.""" config = self._config try: del config[remote] except KeyError: _print_error( "no remote {} found, exit".format(remote) ) return COMMAND_ERROR with open(self.CONFIGFILE, "w") as configfile: config.write(configfile) self._print("Remote {} removed.".format(remote)) return 0 def _remote_show(self): """Print names of remotes. If verbose, print paths as well.""" # that might be useful to specify a remote name, # but git doesn't do that, and we won't. if self._args.verbose: for section, options in self._configdict.items(): print(section, options["destpath"], sep="\t") else: for section in self._config.sections(): print(section) if not self._config.sections(): self._print("No remotes found.") def _show(self, commits=None): """Show commit(s). Print log and difference with the previous commit for each commit. 
""" # commits argument is for testing if commits is None: commits = [int(commit) for commit in self._args.commit] all_commits = sorted(self._get_local_commits()) for commit in commits: if commit not in all_commits: raise ValueError( "no commit {} found".format(commit) ) # commit logs can be None commits_with_logs = self._make_commit_list(commits=commits) sync = self._get_local_sync(verbose=True) try: local_repo = self._get_repo_name_local() except YSConfigurationError: return CONFIG_ERROR for ind, cl in enumerate(commits_with_logs): commit, log = cl # print log if ind: print() self._print_log(commit=commit, log=log, local_repo=local_repo, sync=sync) # print commit commit_ind = all_commits.index(commit) if not commit_ind: print("commit {} is initial commit".format(commit)) continue previous_commit = all_commits[commit_ind - 1] self._diff(commit, previous_commit) def _status(self, check_changed=False): """Print files and directories that were updated more recently than the last commit. Return exit code of `rsync`. If *check_changed* is `True`, return a tuple *(returncode, changed)*, where *changed* is `True` if and only if the working directory has changes since last commit. """ # We don't return an error if the directory has changed, # because it is a normal situation (not an error). # This is the same as in git. try: local_repo_name = self._get_repo_name_local() except YSConfigurationError as err: if not check_changed: return CONFIG_ERROR # should return a tuple, but see no use for it in this case raise err else: self._print("In repository " + local_repo_name) if os.path.exists(self.COMMITDIR): commit_subdirs = [fil for fil in os.listdir(self.COMMITDIR) if _is_commit(fil)] else: commit_subdirs = [] # decided to leave the condition explicit in code. 
# def cond_print(*args, **kwargs): # """A wrapper to ignore check in every place.""" # if check_changed: # return # self._print(*args, **kwargs) ## no commits is fine for an initial commit if not commit_subdirs: configpath = os.path.normpath(self.config_dir) if check_changed: for subdir in os.scandir(self.root_dir): subdir_path = os.path.normpath(subdir.path) if subdir_path != configpath or subdir.is_file(): return (0, True) # if there is only '.ys' in the working directory, # then the repository is unchanged. return (0, False) self._print("No commits found") return 0 head_commit = self._get_head_commit() if head_commit is None: newest_commit = max(map(int, commit_subdirs)) ref_commit_dir = os.path.join(self.COMMITDIR, str(newest_commit)) else: ref_commit_dir = os.path.join(self.COMMITDIR, str(head_commit)) command = [ "rsync", "-aun", # allow incremental recursion until the implementation of # https://github.com/WayneD/rsync/issues/380 # "--no-inc-recursive", "--delete", "-i", "--no-group", "--no-owner", "--exclude=/.ys" ] filter_command = self._get_filter(include_commits=False) command += filter_command # outbuf option added in Rsync 3.1.0 (28 Sep 2013) # https://download.samba.org/pub/rsync/NEWS#ENHANCEMENTS-3.1.0 # from https://stackoverflow.com/a/35775429 command.append('--outbuf=L') root_path = self.root_dir + "/" command += ["--link-dest="+ref_commit_dir, root_path, ref_commit_dir] if not check_changed: self._print_command(command, level=3) # default stderr (None) outputs to parent's stderr sp = subprocess.Popen(command, stdout=subprocess.PIPE) # this works correctly, but strangely for pytest: # https://github.com/pytest-dev/pytest-mock/issues/295#issuecomment-1155091491 # sp = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=sys.stderr) # changed means there were actual changes in the working dir changed = False # note that directories may appear to be changed # just because of timestamps (add and remove a file), e.g. # b'.d..t...... 
./\n' # b'' means EOF in the iteration. lines = iter(sp.stdout.readline, b'') for line in lines: if line: # todo efficiency: check print levels beforehand, # not for each line (as done with _print) # if not check_changed: self._print("Changed since head commit:\n") # skip permission changes if not line.startswith(b'.'): changed = True # if check_changed: # # return code is unimportant in this case # return (0, changed) # print the line and all following lines. # todo: use terminal encoding print(line.decode("utf-8"), end='') # in fact, readline could be used twice. for line in lines: if not line.startswith(b'.'): changed = True # if check_changed: # return (0, changed) print(line.decode("utf-8"), end='') sp.wait() # otherwise returncode might be None # None is fine for sys.exit() though, # because it will be converted to 0. # For testing, it is better to have it 0 here. returncode = sp.returncode commit_limit = self._get_commit_limit() if commit_limit is not None: self._print("\nMaximum number of commits is limited to {}"\ .format(commit_limit)) if head_commit is not None: self._print("\nDetached HEAD (see '{} log' for more recent commits)" .format(self.NAME)) if os.path.exists(self.MERGEFILE): with open(self.MERGEFILE, "r") as fil: merge_str = fil.readlines()[0].strip() merges = merge_str.split(',') self._print("Merging {} and {} (most recent common commit {})."\ .format(*merges)) if not changed and not check_changed: self._print("Nothing to commit, working directory clean.") if changed: # and not check_changed: # better formatting self._print() sync = self._get_local_sync(verbose=not check_changed) if sync and not check_changed: # if we only check for changes (to push or pull), # we are not interested in the commit synchronization status commits = list(self._get_local_commits()) last_commit = self._get_last_commit(commits) if last_commit in sync.by_repos.values(): last_repos = sync.get_synced_repos_for( last_commit, exclude_repo=local_repo_name ) 
self._print("\nCommits are up to date with {}."\ .format(", ".join(last_repos))) else: synced_commits = sync.by_repos.values() if synced_commits: last_synced_commit = max(synced_commits) n_newer_commits = sum([1 for comm in commits if comm > last_synced_commit]) last_repos = sync.get_synced_repos_for( last_synced_commit, exclude_repo=local_repo_name ) self._print("Local repository is {} commits ahead of {}"\ .format(n_newer_commits, ", ".join(last_repos))) # called from an internal method if check_changed: # changed is always False here return (returncode, changed) # called as the main command return returncode def _update_head(self): try: # no HEADFILE means HEAD is the most recent commit os.remove(self.HEADFILE) except FileNotFoundError: pass def _write_repo_name(self, reponame, verbose=True): # todo: if the path contains {}, it can lead to an error repofile = self.REPOFILE.format(reponame) if verbose: self._print("# create configuration file {}".format(repofile)) with open(repofile, "x"): pass # return full path to the repository file return repofile def _write_sync(self, sync, print_level=3): if sync.new or sync.removed: self._print("updating synchronization ... 
", end='', level=print_level-1) if print_level <= self.print_level and (sync.removed or sync.new): print() for sync_str in sync.removed: self._print(" remove", sync_str, level=print_level) os.remove(os.path.join(self.SYNCDIR, sync_str)) if sync.new and not os.path.exists(self.SYNCDIR): if print_level <= self.print_level: print() self._print_command("mkdir {}".format(self.SYNCDIR), level=print_level-1) os.mkdir(self.SYNCDIR) for sync_str in sync.new: self._print(" create", sync_str, level=print_level) with open(os.path.join(self.SYNCDIR, sync_str), "x"): # just create this file pass # we might write sync twice if we encounter an error # (then we remove the wrong repo from sync) sync.new = set() sync.removed = set() self._print("done", level=print_level-1) def __call__(self): """Call the command set during the initialisation.""" try: # all errors are usually transferred as returncode # and functions throw no exceptions returncode = self._func() # in Python 3 EnvironmentError is an alias to OSError except OSError as err: # In Python 3 there are more errors, e.g. PermissionError, etc. # PermissionError belongs to OSError in Python 3, # but to IOError in Python 2. _print_error(err) returncode = 8 # in case of other errors, None will be returned! # todo: what code to return for RuntimeError? return returncode def main(): # parse arguments try: ys = YARsync(sys.argv) except (argparse.ArgumentError, argparse.ArgumentTypeError, YSArgumentError, YSUnrecognizedArgumentsError): ## Argparse error ## # rsync returns 1 in case of syntax or usage error, # therefore we use the same code # (rsync is never called during __init__). # the error message is printed by argparse. sys.exit(SYNTAX_ERROR) except (OSError, YSConfigurationError): ## ys configuration error ## # (not in a repository, configuration file missing, etc.) 
# the error is printed by YARsync sys.exit(CONFIG_ERROR) except SystemExit as err: ## Some runtime error ## # SystemExit can be 130 for python # and 1 for pypy for KeyboardInterrupt. # Since this is interpreter-dependent => unreliable, # we don't capture and return it. # Moreover: we guarantee that our error code does not interfere # with real rsync error codes (during the __init__). if err.code == 0: # normal argparse exit. For example, --help. sys.exit(0) else: sys.exit(SYS_EXIT_ERROR) except YSCommandError: sys.exit(COMMAND_ERROR) # make actual call try: # should this throw exceptions (_clone) # or return a non-zero code (_pull_push)? returncode = ys() except YSCommandError: sys.exit(COMMAND_ERROR) sys.exit(returncode) if __name__ == "__main__": main()
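The two-pointer walk used in `_make_commit_list` to pair commits with their logs can be illustrated with a small standalone sketch. The function name `pair_commits_and_logs` is hypothetical (not part of yarsync); it only demonstrates the pairing logic: walk two sorted integer lists in parallel and emit ``None`` for whichever side is missing.

```python
def pair_commits_and_logs(commits, logs):
    """Pair items of two sorted integer lists.

    Return a list of (commit, log) tuples; a missing
    counterpart is represented by None.
    """
    results = []
    ci, li = 0, 0
    while ci < len(commits) or li < len(logs):
        if li >= len(logs):
            # logs exhausted: commit without a log
            results.append((commits[ci], None))
            ci += 1
        elif ci >= len(commits):
            # commits exhausted: log without a commit
            results.append((None, logs[li]))
            li += 1
        elif commits[ci] == logs[li]:
            # both present for this timestamp
            results.append((commits[ci], logs[li]))
            ci += 1
            li += 1
        elif commits[ci] < logs[li]:
            results.append((commits[ci], None))
            ci += 1
        else:
            results.append((None, logs[li]))
            li += 1
    return results
```

For example, commits ``[1, 3]`` and logs ``[2, 3]`` yield ``[(1, None), (None, 2), (3, 3)]``, matching the behaviour of `_make_commit_list` for the same inputs.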